This paper deals with the asymptotic distribution of the Wishart matrix and its application to the estimation of the population matrix parameter when the population eigenvalues are block-wise infinitely dispersed. We show that the appropriately normalized eigenvectors and eigenvalues asymptotically generate two Wishart matrices and one normally distributed random matrix, which are mutually independent. For a family of orthogonally equivariant estimators, we calculate the asymptotic risks with respect to the entropy or the quadratic loss function and derive the asymptotically best estimator among the family. We numerically show that 1) the convergence in both the distributions and the risks is quick enough for practical use, and 2) the asymptotically best estimator is robust against the deviation of the population eigenvalues from the block-wise infinite dispersion.
1 Introduction

Suppose that a $p$-dimensional random vector $y$ has the covariance matrix $\Sigma$. The inference for $\Sigma$ has been studied in an enormous amount of literature and is still an important topic from both theoretical and practical points of view. Often we assume some structure on $\Sigma$, i.e., a restriction on its parameter space $\{\Sigma \mid \Sigma > 0\}$. A structure, in some cases, arises from a theoretical reason behind the data. In other cases, it appears as a result of exploratory analysis such as principal component analysis or exploratory factor analysis.

For example, suppose that $y$ is generated by the following multivariate linear model:

\[ y = Bx + e, \tag{1} \]

where $B$ is a $p \times m$ coefficient (factor loading) matrix with $\mathrm{rank}\, B = m$, $x$ is a latent $m \times 1$ random vector (common factor), and the $p \times 1$ vector $e$ is an error term (unique factor) which is distributed independently of $x$. If we further assume that $e$ has $\sigma^2 I_p$ ($I_p$: the $p$-dimensional identity matrix) as its covariance matrix, $\Sigma$ is written as

\[ \Sigma = B \Sigma_x B' + \sigma^2 I_p, \]

where $\Sigma_x$ is the nonsingular covariance matrix of $x$. In this case $\Sigma$ has the eigenvalues $\lambda_1 \ge \cdots \ge \lambda_p$ given by

\[ \lambda_i = \begin{cases} \tau_i + \sigma^2, & \text{if } i = 1, \ldots, m, \\ \sigma^2, & \text{if } i = m+1, \ldots, p, \end{cases} \tag{2} \]

where $\tau_i > 0$, $i = 1, \ldots, m$, are the eigenvalues of $B \Sigma_x B'$. It is often observed that $\sigma^2$ is quite small compared to the $\tau_i$'s, which means that the first group of eigenvalues $(\lambda_1, \ldots, \lambda_m)$ is very large compared to the second group $(\lambda_{m+1}, \ldots, \lambda_p)$. In this paper we call this state "(two-)block-wise dispersion" of the population eigenvalues.
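The eigenvalue structure in (2) can be checked numerically. The following is a minimal sketch, assuming a randomly drawn loading matrix and an identity factor covariance; the names `factor_model_eigenvalues`, `B`, `Sigma_x` and the chosen sizes are illustrative assumptions, not part of the paper.

```python
import numpy as np

def factor_model_eigenvalues(B, Sigma_x, sigma2):
    """Eigenvalues (descending) of Sigma = B Sigma_x B' + sigma2 * I_p."""
    p = B.shape[0]
    Sigma = B @ Sigma_x @ B.T + sigma2 * np.eye(p)
    return np.sort(np.linalg.eigvalsh(Sigma))[::-1]

rng = np.random.default_rng(0)
p, m, sigma2 = 5, 2, 0.01
B = rng.normal(size=(p, m))      # hypothetical loading matrix (rank m almost surely)
Sigma_x = np.eye(m)              # hypothetical covariance of the common factors

lam = factor_model_eigenvalues(B, Sigma_x, sigma2)
tau = lam[:m] - sigma2           # the tau_i of equation (2)
ratio = lam[m] / lam[m - 1]      # lambda_{m+1} / lambda_m, small under block-wise dispersion
```

The last $p - m$ entries of `lam` equal `sigma2` up to rounding, and `ratio` measures how far the two blocks are separated.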

What would happen to the sample covariance matrix when the eigenvalues of the population covariance matrix are "infinitely" dispersed? This is an interesting question from a theoretical standpoint. Takemura and Sheena (2005) and Sheena and Takemura (2006) deal with this problem under "total dispersion" of the population eigenvalues, namely

\[ (\lambda_2/\lambda_1, \lambda_3/\lambda_2, \ldots, \lambda_p/\lambda_{p-1}) \to 0. \]

This paper is a generalization of Takemura and Sheena (2005) from a theoretical point of view, while the practical motivation is as follows. As we saw above, we often come across a practical situation where the population eigenvalues are block-wise dispersed. For the inference on $\Sigma$ in such situations, it is helpful to understand the behavior of the sample covariance matrix when the population eigenvalues are block-wise "infinitely" dispersed. The state of the population eigenvalues being infinitely dispersed is a theoretical approximation, but understanding the limiting behavior leads to a better insight into its neighborhood, where the eigenvalues are "largely" dispersed.

Now we formally state the framework of this paper. Let $S = (s_{ij})$ be distributed according to the Wishart distribution $W_p(n, \Sigma)$, where $p$ is the dimension, $n$ is the degrees of freedom, and $\Sigma$ is the covariance matrix. The spectral decompositions of $\Sigma$ and $S$ are given by

\[ \Sigma = \Gamma \Lambda \Gamma', \qquad S = G L G', \]

where $G, \Gamma \in O(p)$, the group of $p \times p$ orthogonal matrices, and $\Lambda = \mathrm{diag}(\lambda_1, \ldots, \lambda_p)$, $L = \mathrm{diag}(l_1, \ldots, l_p)$ are diagonal matrices with the eigenvalues $\lambda_1 \ge \cdots \ge \lambda_p > 0$, $l_1 \ge \cdots \ge l_p > 0$ of $\Sigma$ and $S$, respectively. We use the notations $\lambda = (\lambda_1, \ldots, \lambda_p)$ and $l = (l_1, \ldots, l_p)$ hereafter. By the requirement that

\[ \tilde G = (\tilde g_{ij}) = \Gamma' G \]

has positive diagonal elements, the spectral decomposition $S = GLG'$ is almost surely uniquely determined. Then almost surely there exists a one-to-one correspondence between the set $\{S \mid S > 0\}$ and $\mathcal{L} \times O^+(p)$, where

\[ \mathcal{L} = \{l \mid l_1 > \cdots > l_p > 0\}, \qquad O^+(p) = \{G \in O(p) \mid g_{ii} > 0,\ 1 \le i \le p\}. \]
Let $m$ ($m_1$ in Subsection 2.3) denote the dividing point between the first block and the second block of the eigenvalues. Now we parameterize $\lambda$, $l$ as follows:

\[ \lambda_i = \begin{cases} \xi_i \alpha, & \text{if } i = 1, \ldots, m, \\ \xi_i \beta, & \text{if } i = m+1, \ldots, p, \end{cases} \tag{3} \]

\[ l_i = \begin{cases} d_i \alpha, & \text{if } i = 1, \ldots, m, \\ d_i \beta, & \text{if } i = m+1, \ldots, p. \end{cases} \tag{4} \]

In this paper we always consider the $\xi_i$'s to be given and fixed. We also use the notations

\[ \Xi = \mathrm{diag}(\xi_1, \ldots, \xi_p), \qquad D = \mathrm{diag}(d_1, \ldots, d_p), \qquad d = (d_1, \ldots, d_p). \]

We will investigate the asymptotic distribution of $S$ as $\beta/\alpha$ goes to 0 while $\Xi$ is fixed, and its application to the estimation of $\Sigma$. A value of $\beta/\alpha$ close to 0 means that the eigenvalues of $\Sigma$ are two-block-wise "largely" dispersed. In the following, the notation $\beta/\alpha \to 0$ means a limiting operation $n \to \infty$ with arbitrary sequences $\alpha_n, \beta_n$, $n = 1, 2, \ldots$, such that $\beta_n/\alpha_n \to 0$.

We briefly describe the content of the following sections. In Subsection 2.1 we prepare a local coordinate system of $O^+(p)$ around $I_p$. In Subsection 2.2 we present our main results on asymptotic distributions, and we further discuss the case of multi-block-wise infinite dispersion in Subsection 2.3. Section 3 deals with the estimation of $\Sigma$ in a decision-theoretic framework. In Subsection 3.1 we introduce orthogonally equivariant estimators and two loss functions, and in Subsection 3.2 we calculate the asymptotic risks. We concentrate on the special case of block-wise identity covariance matrices in Subsection 3.3, which is practically important, and we propose the best estimator for this case with respect to each loss function. In Subsection 3.4 the convergence speed of both distributions and risks is numerically evaluated. Together with the application to discriminant analysis, the numerical comparisons show the superiority of the new estimators. In the Appendix we present the proofs of two lemmas and discuss the analytical calculation of the asymptotic risks.

Before concluding this section, we introduce some notational conventions used in this paper. In the sections other than Subsection 2.3, we always consider the same two-block partition of matrices. For a $p \times p$ matrix $A = (a_{ij})$, $A_{ij}$ ($1 \le i, j \le 2$) denotes the $(i, j)$-block in the partition

\[ A = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}, \qquad A_{11}: m \times m, \quad A_{22}: (p-m) \times (p-m). \]

For the particular case of a diagonal matrix $A = \mathrm{diag}(a_1, \ldots, a_p)$, we simply write $A_1$, $A_2$ instead of $A_{11}$, $A_{22}$, i.e. $A_1 = \mathrm{diag}(a_1, \ldots, a_m)$, $A_2 = \mathrm{diag}(a_{m+1}, \ldots, a_p)$. Let $a = (a_{ij})_{1 \le j < i \le p}$ denote the vector of the elements in the lower triangular part of $A$, which is correspondingly partitioned as $a = (a_{11}, a_{22}, a_{21})$, where

\[ a_{11} = (a_{ij})_{1 \le j < i \le m}, \qquad a_{22} = (a_{ij})_{m < j < i \le p}, \qquad a_{21} = (a_{ij})_{1 \le j \le m < i \le p}. \]

If $A$ is block diagonal, i.e. $A_{12} = A_{21}' = 0$, we write

\[ A = \mathrm{diag}(A_{11}, A_{22}). \]

If $a$ is a $p$-dimensional row vector, i.e., $a = (a_1, \ldots, a_p)$, then we partition $a$ as

\[ a = (a_1, a_2), \qquad a_1 = (a_1, \ldots, a_m), \quad a_2 = (a_{m+1}, \ldots, a_p). \]

We write $\mathrm{etr}\, X = \exp(\mathrm{tr}\, X)$ for a square matrix $X$.

2 Asymptotic Distribution 
2.1 Local Coordinates 

We consider a local coordinate $u = (u_{ij})_{1 \le j < i \le p}$ of $O^+(p)$ around the identity matrix $I_p$. For the proof of the existence of such a coordinate, see Appendix B of Takemura and Sheena (2005). We have the following open sets $C_\epsilon$, $U$, $V$ and functions $\phi_{ij}$, $1 \le i \le j \le p$:

\[ C_\epsilon = \{u \mid |u_{ij}| < \epsilon,\ 1 \le j < i \le p\} \subset R^{p(p-1)/2}, \]

and $\phi_{ij}(u)$ is a $C^\infty$ function on $C_\epsilon$ such that $G(u) = (g_{ij}(u))$ defined by

\[ \begin{cases} g_{ij}(u) = \phi_{ij}(u), & 1 \le i \le j \le p, \\ g_{ij}(u) = u_{ij}, & 1 \le j < i \le p, \end{cases} \tag{5} \]

is a one-to-one function from $U$ onto $V$. Using $V$ we can construct a finite open covering of $O^+(p)$ as follows. For $H_1 \in O^+(m)$, $H_2 \in O^+(p-m)$, let

\[ V(H_1, H_2) = \mathrm{diag}(H_1, H_2) V \cap O^+(p) = \{G \mid G = \mathrm{diag}(H_1, H_2) G^*,\ \exists G^* \in V\} \cap O^+(p) \]

denote the open neighborhood of $\mathrm{diag}(H_1, H_2)$. Let

\[ O(m, p-m) = \{\mathrm{diag}(H_1, H_2) \mid H_1 \in O^+(m),\ H_2 \in O^+(p-m)\}; \]

then

\[ O(m, p-m) \subset \bigcup_{H_1 \in O^+(m),\, H_2 \in O^+(p-m)} V(H_1, H_2). \]

Since $O(m, p-m)$ is compact, we can choose a finite number of sets $O^{(\tau)} = V(H_1^{(\tau)}, H_2^{(\tau)})$, $\tau = 1, \ldots, T$, such that $\bigcup_{\tau=1}^{T} O^{(\tau)} \supset O(m, p-m)$. Let $O^{(0)} = O^+(p) \setminus O(m, p-m)$; then we have a finite open covering $\{O^{(\tau)}\}_{\tau=0}^{T}$ of $O^+(p)$. We denote the partition of unity subordinate to $\{O^{(\tau)}\}_{\tau=0}^{T}$ by $\{\iota_\tau\}_{\tau=0}^{T}$. Namely, for each $\tau$, $\iota_\tau$ is a continuous function from $O^+(p)$ to $[0, 1]$, the support of $\iota_\tau$ is contained in $O^{(\tau)}$, and $\sum_{\tau=0}^{T} \iota_\tau(G) = 1$.

For $O^{(\tau)}$, $1 \le \tau \le T$, we can use $u$ as a local coordinate, since $G$ in $O^{(\tau)}$ can be uniquely expressed as $H^{(\tau)} G(u)$ with some $u$ in $U$, where

\[ H^{(\tau)} = \mathrm{diag}(H_1^{(\tau)}, H_2^{(\tau)}), \qquad \tau = 1, \ldots, T. \tag{6} \]

As we will see later, we do not need a local coordinate for $O^{(0)}$, since the measure of this area asymptotically vanishes.
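Now we have $(l, u)$ as a local coordinate on each $\mathcal{L} \times O^{(\tau)}$, $\tau = 1, \ldots, T$. We need another local coordinate to investigate the asymptotic behavior of $S$. Let $q = (q_{ij})_{1 \le j < i \le p}$ be defined as follows as another coordinate on $O^{(\tau)}$ for a fixed $\tau$, $\tau = 1, \ldots, T$: if $1 \le j \le m < i \le p$,

\[ q_{ij} = l_j^{1/2} \lambda_i^{-1/2} \sum_{t=m+1}^{p} (H_2^{(\tau)})_{i-m,\,t-m}\, u_{tj} = \alpha^{1/2} \beta^{-1/2} d_j^{1/2} \xi_i^{-1/2} \sum_{t=m+1}^{p} (H_2^{(\tau)})_{i-m,\,t-m}\, u_{tj}, \tag{7} \]

and $q_{ij} = u_{ij}$ otherwise. If we use the matrices $Q = (q_{ij})$, $U = (u_{ij})$ and their partitions, (7) is the same as

\[ Q_{21} = \alpha^{1/2} \beta^{-1/2}\, \Xi_2^{-1/2} H_2^{(\tau)} U_{21} D_1^{1/2}. \tag{8} \]

Conversely

\[ U_{21} = \alpha^{-1/2} \beta^{1/2}\, H_2^{(\tau)\prime}\, \Xi_2^{1/2} Q_{21} D_1^{-1/2}, \tag{9} \]

or

\[ u_{ij} = \begin{cases} \alpha^{-1/2} \beta^{1/2} \sum_{t=m+1}^{p} (H_2^{(\tau)})_{t-m,\,i-m}\, q_{tj}\, \xi_t^{1/2} d_j^{-1/2}, & \text{if } 1 \le j \le m < i \le p, \\ q_{ij}, & \text{otherwise.} \end{cases} \tag{10} \]

Pairing $q = (q_{ij})_{1 \le j < i \le p}$ with $d = (d_1, \ldots, d_p)$, we have another local coordinate $(d, q)$ on $\mathcal{D} \times O^{(\tau)}$, where

\[ \mathcal{D} = (\mathcal{D}_1 \times \mathcal{D}_2) \cap \mathcal{D}_3 \tag{11} \]

with

\[ \mathcal{D}_1 = \{d_1 \mid d_1 > \cdots > d_m > 0\}, \quad \mathcal{D}_2 = \{d_2 \mid d_{m+1} > \cdots > d_p > 0\}, \quad \mathcal{D}_3 = \{(d_1, d_2) \mid d_m/d_{m+1} > \beta/\alpha\}. \]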

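The Jacobian of the transformation $J((l, u) \to (d, q))$ is given by

\[ J((l, u) \to (d, q)) = \alpha^{m} \beta^{p-m} \left(\frac{\beta}{\alpha}\right)^{m(p-m)/2} \prod_{j=1}^{m} d_j^{-(p-m)/2} \prod_{i=m+1}^{p} \xi_i^{m/2}. \tag{12} \]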

2.2 Main Results 

The following theorem says that $\tilde G$ asymptotically separates into two orthogonal matrices $\tilde G_{11}$, $\tilde G_{22}$ on the diagonal blocks.

Theorem 1

1. As $\beta/\alpha \to 0$, $\tilde G_{21} \to_p 0$.

2. $\lim_{\beta/\alpha \to 0} P(\tilde G \in O) = 1$ for any open set $O \subset O^+(p)$ including $O(m, p-m)$.
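Proof. Since 2 is easily proved from 1, we only prove 1 here. Let

\[ \tilde S = (\tilde s_{ij}) = \Lambda^{-1/2} \Gamma' S \Gamma \Lambda^{-1/2} = \Lambda^{-1/2} \tilde G L \tilde G' \Lambda^{-1/2} \sim W_p(n, I_p). \]

Suppose $1 \le j \le m < i \le p$. Note that

\[ \tilde s_{ii} = \lambda_i^{-1} \sum_{t=1}^{p} \tilde g_{it}^2\, l_t \ge \tilde g_{ij}^2\, l_j\, \lambda_i^{-1}. \]

Therefore

\[ \tilde g_{ij}^2 \le \tilde s_{ii}\, \frac{\lambda_i}{l_j} = \tilde s_{ii}\, \frac{\lambda_j}{l_j}\, \frac{\xi_i}{\xi_j}\, \frac{\beta}{\alpha}. \tag{13} \]

Since $\tilde s_{ii}$ is distributed independently of $\Sigma$, for any $\epsilon > 0$ there exists $M$ such that

\[ P(\tilde s_{ii} < M) > 1 - \epsilon, \qquad \forall \Sigma. \tag{14} \]

Besides, from the result of Lemma 1 of Takemura and Sheena (2005), for any $\epsilon > 0$ there exists $C$ such that

\[ P\left(\frac{\lambda_j}{l_j} < C\right) > 1 - \epsilon, \qquad \forall \Sigma. \tag{15} \]

From (14) and (15), $\tilde s_{ii} \lambda_j / l_j$ is bounded in probability uniformly in $\Sigma$. From this fact and (13) we have

\[ \tilde s_{ii}\, \frac{\lambda_j}{l_j}\, \frac{\beta}{\alpha} \to_p 0 \quad \text{as } \frac{\beta}{\alpha} \to 0, \]

and hence

\[ \tilde g_{ij} \to_p 0 \quad \text{as } \frac{\beta}{\alpha} \to 0, \qquad 1 \le \forall j \le m < \forall i \le p. \]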
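The shrinkage of the off-diagonal block $\tilde G_{21}$ stated in Theorem 1 can be observed in a small simulation. This is only an illustrative sketch under the assumptions $\Gamma = I_p$, $\Xi = I_p$ and $\alpha = 1$; the function names and the chosen $p$, $m$, $n$ are hypothetical.

```python
import numpy as np

def tilde_G(S):
    """Eigenvectors of S (eigenvalues in descending order) with positive diagonal.
    Gamma = I_p is assumed here, so that tilde G = Gamma' G = G."""
    evals, evecs = np.linalg.eigh(S)       # eigh returns ascending eigenvalues
    G = evecs[:, ::-1]                     # reorder columns to descending eigenvalues
    return G * np.sign(np.diag(G))         # flip column signs so that g_ii > 0

def mean_norm_G21(ratio, p=4, m=2, n=10, reps=500, seed=0):
    """Average Frobenius norm of tilde G_21 when beta/alpha = ratio."""
    rng = np.random.default_rng(seed)
    lam = np.concatenate([np.ones(m), ratio * np.ones(p - m)])
    total = 0.0
    for _ in range(reps):
        X = rng.normal(size=(n, p)) * np.sqrt(lam)   # rows ~ N(0, diag(lam))
        G = tilde_G(X.T @ X)                         # X'X ~ W_p(n, diag(lam))
        total += np.linalg.norm(G[m:, :m])
    return total / reps

norms = [mean_norm_G21(r) for r in (1e-1, 1e-3, 1e-5)]
```

The averages in `norms` decrease as the ratio $\beta/\alpha$ decreases, in line with $\tilde G_{21} \to_p 0$.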



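Next we state a rather technical lemma, which will be used in the proofs of some theorems. Consider a random variable $x(G, l, \lambda, \alpha, \beta)$. We are often interested in the asymptotic expectation of $x(G, l, \lambda, \alpha, \beta)$ as $\beta/\alpha \to 0$ while $\Gamma$ is fixed. For fixed $\Gamma$ and $H^{(\tau)} = \mathrm{diag}(H_1^{(\tau)}, H_2^{(\tau)})$, $H_1^{(\tau)} \in O^+(m)$, $H_2^{(\tau)} \in O^+(p-m)$, somewhat abusing the notation, let

\[ x(d, q, \xi, \alpha, \beta; \Gamma, H^{(\tau)}) = x(\Gamma H^{(\tau)} G(u(d, q, \xi, \alpha, \beta)),\ l(d, \alpha, \beta),\ \lambda(\xi, \alpha, \beta),\ \alpha, \beta) \tag{16} \]

to emphasize that the right-hand side is a function of $(d, q, \xi, \alpha, \beta)$, where $G(u)$, $u(d, q, \xi, \alpha, \beta)$, $l(d, \alpha, \beta)$, $\lambda(\xi, \alpha, \beta)$ are respectively defined by (5), (10), (4) and (3). For $u = (u_{11}, u_{22}, u_{21})$, we have

\[ \lim_{\beta/\alpha \to 0} u(d, q, \xi, \alpha, \beta) = \lim_{\beta/\alpha \to 0} (u_{11}(q_{11}),\ u_{22}(q_{22}),\ u_{21}(d, q, \xi, \alpha, \beta)) = (q_{11}, q_{22}, 0), \tag{17} \]

hence

\[ \lim_{\beta/\alpha \to 0} G(u(d, q, \xi, \alpha, \beta)) = G(q_{11}, q_{22}, 0). \tag{18} \]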
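Lemma 1 Suppose that there exist some $a < 1/2$ and $b > 0$ such that

\[ |x(\Gamma \tilde G, l, \lambda, \alpha, \beta)| \le b\, \mathrm{etr}(a\, \tilde G L \tilde G' \Lambda^{-1}) \quad \text{a.e. in } (\tilde G, l), \tag{19} \]

and suppose that for each $\tau$, $\tau = 1, \ldots, T$, $\lim_{\beta/\alpha \to 0} x(d, q, \xi, \alpha, \beta; \Gamma, H^{(\tau)})$ exists and equals a function

\[ \bar x_\Gamma(H^{(\tau)} G(q_{11}, q_{22}, 0),\ d,\ q_{21},\ \xi). \tag{20} \]

Then

\[ \lim_{\beta/\alpha \to 0} E[x(G, l, \lambda, \alpha, \beta)] = E[\bar x_\Gamma(\mathrm{diag}(\tilde G_{11}(\tilde W_{11}), \tilde G_{22}(\tilde W_{22})),\ (d_1(\tilde W_{11}), d_2(\tilde W_{22})),\ \tilde Z_{21},\ \xi)], \tag{21} \]

where the expectation on the right side of (21) is taken with respect to the following mutually independent distributions

\[ \tilde W_{11} \sim W_m(n, \Xi_1), \qquad \tilde W_{22} \sim W_{p-m}(n-m, \Xi_2), \qquad \tilde Z_{21} \sim N_{(p-m) \times m}(0, I_{p-m} \otimes I_m), \tag{22} \]

and $\tilde G_{ss}(\tilde W_{ss})$, $d_s(\tilde W_{ss})$, $s = 1, 2$, are the components in the unique spectral decomposition of $\tilde W_{ss}$ for $s = 1, 2$:

\[ \tilde W_{11} = \tilde G_{11} D_1 \tilde G_{11}', \quad D_1 = \mathrm{diag}(d_1, \ldots, d_m), \quad d_1 = (d_1, \ldots, d_m), \]
\[ \tilde W_{22} = \tilde G_{22} D_2 \tilde G_{22}', \quad D_2 = \mathrm{diag}(d_{m+1}, \ldots, d_p), \quad d_2 = (d_{m+1}, \ldots, d_p). \tag{23} \]

The proof is given in the Appendix.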

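The following theorem on the asymptotic distributions is actually a corollary of Lemma 1. Let

\[ \tilde W_{11} = \tilde G_{11} D_1 \tilde G_{11}', \qquad \tilde W_{22} = \tilde G_{22} D_2 \tilde G_{22}', \qquad \tilde Z_{21} = \alpha^{1/2} \beta^{-1/2}\, \Xi_2^{-1/2} \tilde G_{21} D_1^{1/2}, \]

where all the elements on the right-hand side are defined in Section 1.

Theorem 2 As $\beta/\alpha \to 0$,

\[ \tilde W_{11} \to_d W_m(n, \Xi_1), \qquad \tilde W_{22} \to_d W_{p-m}(n-m, \Xi_2), \qquad \tilde Z_{21} \to_d N_{(p-m) \times m}(0, I_{p-m} \otimes I_m), \]

and $\tilde W_{11}$, $\tilde W_{22}$, $\tilde Z_{21}$ are asymptotically mutually independently distributed.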
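A rough Monte Carlo check of Theorem 2 can be sketched as follows, assuming $\Gamma = I_p$ and $\Xi = I_p$ so that the limiting means are easy to state: the trace of $W_m(n, I_m)$ has mean $nm$, the trace of $W_{p-m}(n-m, I_{p-m})$ has mean $(n-m)(p-m)$, and the entries of the normal matrix have mean 0 and variance 1. The function names, sample sizes and the value $\beta/\alpha = 10^{-6}$ are illustrative assumptions.

```python
import numpy as np

def normalized_statistics(S, alpha, beta, xi, m):
    """(W11, W22, Z21) as defined before Theorem 2, for one draw S, assuming Gamma = I_p."""
    evals, evecs = np.linalg.eigh(S)
    l = evals[::-1]                               # eigenvalues, descending
    G = evecs[:, ::-1]
    G = G * np.sign(np.diag(G))                   # positive-diagonal convention
    d1, d2 = l[:m] / alpha, l[m:] / beta          # parametrization (4)
    G11, G21, G22 = G[:m, :m], G[m:, :m], G[m:, m:]
    W11 = (G11 * d1) @ G11.T                      # G11 D1 G11'
    W22 = (G22 * d2) @ G22.T                      # G22 D2 G22'
    Z21 = np.sqrt(alpha / beta) * (G21 * np.sqrt(d1)) / np.sqrt(xi[m:])[:, None]
    return W11, W22, Z21

def simulate(p=4, m=2, n=10, alpha=1e6, beta=1.0, reps=4000, seed=1):
    rng = np.random.default_rng(seed)
    xi = np.ones(p)
    lam = xi * np.where(np.arange(p) < m, alpha, beta)   # parametrization (3)
    tr11, tr22, z = [], [], []
    for _ in range(reps):
        X = rng.normal(size=(n, p)) * np.sqrt(lam)
        W11, W22, Z21 = normalized_statistics(X.T @ X, alpha, beta, xi, m)
        tr11.append(np.trace(W11))
        tr22.append(np.trace(W22))
        z.append(Z21.ravel())
    z = np.concatenate(z)
    return np.mean(tr11), np.mean(tr22), z.mean(), z.var()

mean_tr11, mean_tr22, z_mean, z_var = simulate()
```

With these settings `mean_tr11` should be close to $nm = 20$, `mean_tr22` close to $(n-m)(p-m) = 16$, and the entries of $\tilde Z_{21}$ close to standard normal in their first two moments. The reduction of the degrees of freedom from $n$ to $n - m$ in the second block is the feature most easily seen in such a check.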
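Proof. Let $\Theta_{11}$ be an $m \times m$ symmetric matrix, $\Theta_{22}$ a $(p-m) \times (p-m)$ symmetric matrix and $\Theta_{21}$ an $m \times (p-m)$ matrix. Consider the moment generating function

\[ x(G, l, \lambda, \alpha, \beta) = \exp(\mathrm{tr}\, \tilde W_{11} \Theta_{11} + \mathrm{tr}\, \tilde W_{22} \Theta_{22} + \mathrm{tr}\, \tilde Z_{21} \Theta_{21}) = \exp\Big(\sum_{s=1}^{2} \mathrm{tr}\, \tilde W_{ss} \Theta_{ss} + \mathrm{tr}\, \tilde Z_{21} \Theta_{21}\Big). \]

For $H^{(\tau)} = \mathrm{diag}(H_1^{(\tau)}, H_2^{(\tau)})$, $H_1^{(\tau)} \in O^+(m)$, $H_2^{(\tau)} \in O^+(p-m)$, we have

\[ x(\Gamma H^{(\tau)} G(u), l, \lambda, \alpha, \beta) = \exp\Big\{\sum_{s=1}^{2} \mathrm{tr}\, (H^{(\tau)} G(u))_{ss} D_s (H^{(\tau)} G(u))_{ss}' \Theta_{ss} + \mathrm{tr}\, \alpha^{1/2} \beta^{-1/2}\, \Xi_2^{-1/2} (H^{(\tau)} G(u))_{21} D_1^{1/2} \Theta_{21}\Big\}. \]

From (5),

\[ (H^{(\tau)} G(u))_{21} = H_2^{(\tau)} U_{21}, \]

hence from (8)

\[ \alpha^{1/2} \beta^{-1/2}\, \Xi_2^{-1/2} (H^{(\tau)} G(u))_{21} D_1^{1/2} = Q_{21}. \]

This leads to

\[ x(d, q, \xi, \alpha, \beta; \Gamma, H^{(\tau)}) = \exp\Big\{\sum_{s=1}^{2} \mathrm{tr}\, (H^{(\tau)} G(u))_{ss} D_s (H^{(\tau)} G(u))_{ss}' \Theta_{ss} + \mathrm{tr}\, Q_{21} \Theta_{21}\Big\}, \]

with $u = u(d, q, \xi, \alpha, \beta)$. Therefore from (18)

\[ \lim_{\beta/\alpha \to 0} x(d, q, \xi, \alpha, \beta; \Gamma, H^{(\tau)}) = \exp\Big\{\sum_{s=1}^{2} \mathrm{tr}\, (H^{(\tau)} G(q_{11}, q_{22}, 0))_{ss} D_s (H^{(\tau)} G(q_{11}, q_{22}, 0))_{ss}' \Theta_{ss} + \mathrm{tr}\, Q_{21} \Theta_{21}\Big\}. \]

From Lemma 1,

\[ \lim_{\beta/\alpha \to 0} E[\exp(\mathrm{tr}\, \tilde W_{11} \Theta_{11} + \mathrm{tr}\, \tilde W_{22} \Theta_{22} + \mathrm{tr}\, \tilde Z_{21} \Theta_{21})] \]
\[ = E\Big[\exp\Big\{\sum_{s=1}^{2} \mathrm{tr}\, \tilde G_{ss}(\tilde W_{ss}) D_s(\tilde W_{ss}) \tilde G_{ss}(\tilde W_{ss})' \Theta_{ss} + \mathrm{tr}\, \tilde Z_{21} \Theta_{21}\Big\}\Big] \]
\[ = E[\mathrm{etr}\, \tilde W_{11} \Theta_{11}]\, E[\mathrm{etr}\, \tilde W_{22} \Theta_{22}]\, E[\mathrm{etr}\, \tilde Z_{21} \Theta_{21}], \]

where in the second and third expressions the expectations are taken with respect to the distributions (22) in Lemma 1. ∎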

2.3 Multi-block Partition 

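In this subsection, we extend Theorem 2 to multi-block cases. We partition $(1, \ldots, p)$ into $k$ blocks:

  1st block  $(m_0 + 1, \ldots, m_1)$,
  2nd block  $(m_1 + 1, \ldots, m_2)$,
  ...
  $k$th block  $(m_{k-1} + 1, \ldots, m_k)$,

where

\[ m_0 = 0 < m_1 < m_2 < \cdots < m_k = p. \]

Let $[i]$, $i = 1, \ldots, p$, denote the block containing $i$, i.e.,

\[ [i] = s, \quad \text{if } m_{s-1} + 1 \le i \le m_s. \]

We also use the notations $\bar m_s = m_s - m_{s-1}$, $s = 1, \ldots, k$, for the block sizes.

Corresponding to the above partition, we make the following partition of a $p \times p$ matrix $A = (a_{ij})$:

\[ A = \begin{pmatrix} A_{11} & \cdots & A_{1k} \\ \vdots & \ddots & \vdots \\ A_{k1} & \cdots & A_{kk} \end{pmatrix}, \qquad A_{st}: \bar m_s \times \bar m_t \text{ matrix}, \quad 1 \le s, t \le k. \]

For a diagonal matrix $A = \mathrm{diag}(a_1, \ldots, a_p)$, we use the notation

\[ A = \mathrm{diag}(A_1, \ldots, A_k), \qquad A_s = \mathrm{diag}(a_{m_{s-1}+1}, \ldots, a_{m_s}), \quad s = 1, \ldots, k. \]

Consider the following parametrization of $l$, $\lambda$:

\[ l_i = d_i \alpha_{[i]}, \qquad \lambda_i = \xi_i \alpha_{[i]}, \qquad 1 \le i \le p. \]

In this subsection we again consider the $\xi_i$'s to be fixed. Now we define $\tilde W_{ss}$, $\tilde Z_{st}$, $1 \le t < s \le k$:

\[ \tilde W_{ss} = \tilde G_{ss} D_s \tilde G_{ss}', \qquad \tilde Z_{st} = \alpha_t^{1/2} \alpha_s^{-1/2}\, \Xi_s^{-1/2} \tilde G_{st} D_t^{1/2}, \]

where the notations on the right-hand side are defined in Section 1. The following theorem is the extension of Theorem 2.

Theorem 3 As $(\alpha_2/\alpha_1, \alpha_3/\alpha_2, \ldots, \alpha_k/\alpha_{k-1}) \to 0$,

\[ \tilde W_{ss} \to_d W_{\bar m_s}(n - m_{s-1}, \Xi_s), \quad 1 \le s \le k, \]
\[ \tilde Z_{st} \to_d N_{\bar m_s \times \bar m_t}(0, I_{\bar m_s} \otimes I_{\bar m_t}), \quad 1 \le t < s \le k, \]

and $\tilde W_{ss}$ ($1 \le s \le k$), $\tilde Z_{st}$ ($1 \le t < s \le k$) are asymptotically mutually independently distributed.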

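Proof. Though we can prove the theorem in the same manner as the proof of Theorem 2, it is notationally too cumbersome. Instead we prove the theorem by using Theorem 2 recursively. Let $r_1 = \alpha_1$ and $r_t = \alpha_t/\alpha_{t-1}$, $t = 2, \ldots, k$; then $\prod_{t=1}^{s} r_t = \alpha_s$, $s = 1, \ldots, k$. Note that, for $1 \le i \le p$,

\[ l_i = d_i \prod_{t=1}^{[i]} r_t, \qquad \lambda_i = \xi_i \prod_{t=1}^{[i]} r_t. \]

We consider the moment generating function

\[ E\Big[\exp\Big(\mathrm{tr} \sum_{s=1}^{k} \tilde W_{ss} \Theta_{ss} + \mathrm{tr} \sum_{1 \le t < s \le k} \tilde Z_{st} \Theta_{st}\Big)\Big], \]

where $\Theta_{ss}$ ($1 \le s \le k$) and $\Theta_{st}$ ($1 \le t < s \le k$) are respectively an $\bar m_s \times \bar m_s$ symmetric matrix and an $\bar m_t \times \bar m_s$ matrix. Writing $E[\,\cdot\,]$ for this moment generating function, we have

\[ \lim_{(\alpha_2/\alpha_1, \ldots, \alpha_k/\alpha_{k-1}) \to 0} E[\,\cdot\,] = \lim_{(r_2, \ldots, r_k) \to 0} E[\,\cdot\,] = \lim_{r_2 \to 0} \cdots \lim_{r_k \to 0} E[\,\cdot\,]. \]

We omit the technical arguments on uniform convergence, which guarantee the decomposition of $\lim_{(r_2, \ldots, r_k) \to 0}$ in the second expression into the step-by-step limiting operations $\lim_{r_2 \to 0} \cdots \lim_{r_k \to 0}$ in the third expression.

Consider the partitions

\[ \tilde G = \begin{pmatrix} \tilde G^{(k-1)} & * \\ (\tilde G_{k1} \cdots \tilde G_{k\,k-1}) & \tilde G_{kk} \end{pmatrix}, \qquad \text{where} \quad \tilde G^{(k-1)} = \begin{pmatrix} \tilde G_{11} & \cdots & \tilde G_{1\,k-1} \\ \vdots & & \vdots \\ \tilde G_{k-1\,1} & \cdots & \tilde G_{k-1\,k-1} \end{pmatrix}. \]

Define $D^*$, $\Xi^*$ as partitioned matrices:

\[ D^* = \mathrm{diag}(L^{(k-1)}, D_k), \qquad \Xi^* = \mathrm{diag}(\Lambda^{(k-1)}, \Xi_k), \]

where

\[ L^{(k-1)} = \mathrm{diag}(l_1, \ldots, l_{m_{k-1}}), \qquad \Lambda^{(k-1)} = \mathrm{diag}(\lambda_1, \ldots, \lambda_{m_{k-1}}). \]

Let $\alpha = 1$, $\beta = \alpha_k = \prod_{t=1}^{k} r_t$. Then

\[ L = \mathrm{diag}(L^{(k-1)} \alpha,\ D_k \beta), \qquad \Lambda = \mathrm{diag}(\Lambda^{(k-1)} \alpha,\ \Xi_k \beta). \]

Since $\beta/\alpha \to 0$ as $r_k \to 0$, from Theorem 2 we have

\[ \tilde S^{(k-1)} = \tilde G^{(k-1)} L^{(k-1)} \tilde G^{(k-1)\prime} \to_d W_{m_{k-1}}(n, \Lambda^{(k-1)}), \]
\[ \tilde W_{kk} \to_d W_{\bar m_k}(n - m_{k-1}, \Xi_k), \]
\[ (\tilde Z_{k1}, \ldots, \tilde Z_{k\,k-1}) \to_d N_{\bar m_k \times m_{k-1}}(0, I_{\bar m_k} \otimes I_{m_{k-1}}), \]

and the asymptotic distributions are mutually independent. Therefore

\[ \lim_{r_k \to 0} E\Big[\exp\Big(\mathrm{tr} \sum_{s=1}^{k} \tilde W_{ss} \Theta_{ss} + \mathrm{tr} \sum_{1 \le t < s \le k} \tilde Z_{st} \Theta_{st}\Big)\Big] \]
\[ = E\Big[\exp\Big(\mathrm{tr} \sum_{s=1}^{k-1} \tilde W_{ss}(\tilde S^{(k-1)}) \Theta_{ss} + \mathrm{tr} \sum_{1 \le t < s \le k-1} \tilde Z_{st}(\tilde S^{(k-1)}) \Theta_{st}\Big)\Big] \times E[\mathrm{etr}\, \tilde W_{kk} \Theta_{kk}] \times \prod_{t=1}^{k-1} E[\mathrm{etr}\, \tilde Z_{kt} \Theta_{kt}], \]

where the expectations on the right-hand side are taken with respect to the above asymptotic distributions. If we apply Theorem 2 again to $\tilde S^{(k-1)}$, and recursively to the upper-left block Wishart distribution which asymptotically arises, we obtain the result. ∎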

Note that Theorem 3 reduces to Theorem 2 of Takemura and Sheena (2005) in the extreme case of 1-element blocks, $\bar m_s = 1$, $s = 1, \ldots, p$. Therefore Theorem 3 is a generalization of Theorem 2 of Takemura and Sheena (2005).



3 Application to Estimation of $\Sigma$

3.1 Loss Functions and Orthogonally Equivariant Estimators

In this section, we apply the asymptotic result on the distribution of $S$ to the estimation of $\Sigma$ when $\beta/\alpha$ vanishes. We take a decision-theoretic approach to evaluate the performance of the estimators. We deal with two loss functions; one is Stein's loss (entropy loss) function

\[ L_1(\hat\Sigma, \Sigma) = \mathrm{tr}(\hat\Sigma \Sigma^{-1}) - \log|\hat\Sigma \Sigma^{-1}| - p, \tag{24} \]

and the other is a scale-invariant quadratic loss function

\[ L_2(\hat\Sigma, \Sigma) = \mathrm{tr}(\hat\Sigma \Sigma^{-1} - I_p)^2. \tag{25} \]

The associated risk functions are denoted as

\[ R_d(\hat\Sigma, \Sigma) = E[L_d(\hat\Sigma, \Sigma)], \qquad d = 1, 2. \]
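The two loss functions (24) and (25) translate directly into code. The following minimal sketch (function names are illustrative, not from the paper) evaluates both losses for given positive definite matrices.

```python
import numpy as np

def stein_loss(Sigma_hat, Sigma):
    """Stein's (entropy) loss (24): tr(M) - log|M| - p with M = Sigma_hat Sigma^{-1}."""
    p = Sigma.shape[0]
    M = Sigma_hat @ np.linalg.inv(Sigma)
    _, logdet = np.linalg.slogdet(M)        # log|M|, numerically stable
    return np.trace(M) - logdet - p

def quadratic_loss(Sigma_hat, Sigma):
    """Scale-invariant quadratic loss (25): tr((Sigma_hat Sigma^{-1} - I_p)^2)."""
    p = Sigma.shape[0]
    M = Sigma_hat @ np.linalg.inv(Sigma) - np.eye(p)
    return np.trace(M @ M)

Sigma = np.diag([4.0, 2.0, 1.0])            # hypothetical population covariance
loss1_exact = stein_loss(Sigma, Sigma)      # zero when the estimate is exact
loss2_exact = quadratic_loss(Sigma, Sigma)
loss1_double = stein_loss(2 * Sigma, Sigma) # equals p - p*log(2) for Sigma_hat = 2*Sigma
loss2_double = quadratic_loss(2 * Sigma, Sigma)
```

Both losses depend on $\hat\Sigma$ and $\Sigma$ only through $\hat\Sigma\Sigma^{-1}$, which is why they are invariant under a common change of scale.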

The classical estimator of $\Sigma$ is the unbiased estimator

\[ \hat\Sigma^{U} = n^{-1} S, \]

which has been widely used in statistical analysis, especially in statistical software packages. However, as James and Stein (1961) showed, this estimator is neither minimax nor admissible with respect to Stein's loss function (24). The same drawback with respect to the quadratic loss function (25) was reported by Olkin and Selliah (1977). Following these pioneering papers, much literature has been written seeking an estimator superior to $\hat\Sigma^{U}$. See Pal (1993) for a review of the estimation of $\Sigma$. In this paper we only refer to the orthogonally equivariant estimators proposed by Stein (1977), Dey and Srinivasan (1985) and Krishnamoorthy and Gupta (1989). An estimator of the form

\[ \hat\Sigma = G \Psi(L) G', \qquad \Psi(L) = \mathrm{diag}(\psi_1(l), \ldots, \psi_p(l)) \]

is called orthogonally equivariant; i.e., $\hat\Sigma(GSG') = G \hat\Sigma(S) G'$, $\forall G \in O(p)$.

Stein (1977) and Dey and Srinivasan (1985) proposed the orthogonally equivariant estimator $\hat\Sigma^{SDS}$, defined by

\[ \psi_i(l) = l_i \Delta_i^{SDS}, \qquad 1 \le i \le p, \]

where $\Delta_i^{SDS} = (n + p + 1 - 2i)^{-1}$. $\hat\Sigma^{SDS}$ is of simple form but dominates $\hat\Sigma^{U}$ with substantially better risk with respect to the loss function (24). It is also a minimax estimator. See Dey and Srinivasan (1985) and Sugiura and Ishibayashi (1997) for more details. Order preservation among $\psi_i(l)$, $i = 1, \ldots, p$, is discussed in Sheena and Takemura (1992).

The orthogonally equivariant estimator $\hat\Sigma^{KG}$ is defined by

\[ \psi_i(l) = l_i \Delta_i^{KG}, \qquad 1 \le i \le p, \]

where $\Delta_i^{KG}$ is given by

\[ (\Delta_1^{KG}, \ldots, \Delta_p^{KG})' = A^{-1} b \]

with a $p \times p$ matrix $A = (a_{ij})$ and a $p \times 1$ vector $b = (b_i)$ defined by

\[ a_{ij} = \begin{cases} (n + p - 2i + 1)(n + p - 2i + 3), & \text{if } i = j, \\ n + p - 2i + 1, & \text{if } i > j, \\ n + p - 2j + 1, & \text{if } j > i, \end{cases} \]
\[ b_i = n + p + 1 - 2i, \qquad i = 1, \ldots, p. \]

$\hat\Sigma^{KG}$ is conjectured to be a minimax estimator which dominates $\hat\Sigma^{U}$ with respect to the loss function (25). This was proved by Sheena (2002) for the case $p = 2$.
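The weights $\Delta_i^{SDS}$ and $\Delta_i^{KG}$ defined above are simple to compute. The sketch below follows those definitions literally; the function names are illustrative assumptions.

```python
import numpy as np

def sds_weights(n, p):
    """Delta_i^{SDS} = 1 / (n + p + 1 - 2i), i = 1, ..., p."""
    i = np.arange(1, p + 1)
    return 1.0 / (n + p + 1 - 2 * i)

def kg_weights(n, p):
    """Delta^{KG} = A^{-1} b with A and b as defined for the KG estimator."""
    i = np.arange(1, p + 1)
    v = (n + p - 2 * i + 1).astype(float)     # v_i = n + p - 2i + 1 (also equals b_i)
    A = np.empty((p, p))
    for r in range(p):
        for c in range(p):
            if r == c:
                A[r, c] = v[r] * (v[r] + 2)   # (n+p-2i+1)(n+p-2i+3)
            else:
                A[r, c] = v[max(r, c)]        # off-diagonal: value at the larger index
    return np.linalg.solve(A, v)

delta_sds = sds_weights(n=4, p=3)
delta_kg = kg_weights(n=4, p=3)
```

For $n = 4$, $p = 3$ these weights are approximately $(0.1667, 0.2500, 0.5000)$ and $(0.1060, 0.1332, 0.1902)$, and both sequences are increasing in $i$.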

In this section we only consider orthogonally equivariant estimators given by

\[ \psi_i(l) = c_i l_i, \qquad 1 \le i \le p, \tag{26} \]

with some constants $c_i$ ($1 \le i \le p$), or in matrix expression,

\[ \Psi(L) = L^{1/2} C L^{1/2}, \qquad C = \mathrm{diag}(c_1, \ldots, c_p). \]

It is interesting that $\hat\Sigma^{SDS}$ and $\hat\Sigma^{KG}$ are also the minimum risk estimators among the estimators of the form (26), respectively for $L_1(\cdot, \cdot)$ and $L_2(\cdot, \cdot)$, when all the population eigenvalues are infinitely dispersed. See Takemura and Sheena (2005) for more details.

3.2 Asymptotic Risk 

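This subsection is devoted to the calculation of the asymptotic risks $\bar R_d(\hat\Sigma, \Sigma)$,

\[ \bar R_d(\hat\Sigma, \Sigma) = \lim_{\beta/\alpha \to 0} R_d(\hat\Sigma, \Sigma), \qquad d = 1, 2, \]

for an orthogonally equivariant estimator defined by (26). Note that

\[ R_1(\hat\Sigma, \Sigma) = E[\mathrm{tr}\, G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma'] - \log|C| - E[\log|\Sigma^{-1/2} S \Sigma^{-1/2}|] - p \]
\[ = E[\mathrm{tr}\, G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma'] - \sum_{i=1}^{p} \log c_i - \sum_{i=1}^{p} E[\log \chi^2_{n-i+1}] - p, \tag{27} \]

\[ R_2(\hat\Sigma, \Sigma) = E[\mathrm{tr}(G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma' - I_p)^2] \]
\[ = E[\mathrm{tr}(G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma')^2] - 2 E[\mathrm{tr}\, G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma'] + p. \tag{28} \]

For the evaluation of $E[\log|\Sigma^{-1/2} S \Sigma^{-1/2}|]$, see e.g. (10) on p. 132 of Muirhead (1982). We start with the following lemma, the proof of which is given in the Appendix.

Lemma 2

\[ \lim_{\beta/\alpha \to 0} E[\mathrm{tr}\, G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma'] \]
\[ = E[\mathrm{tr}\, \tilde G_{11} D_1^{1/2} C_1 D_1^{1/2} \tilde G_{11}' \Xi_1^{-1}] + E[\mathrm{tr}\, \tilde G_{22} D_2^{1/2} C_2 D_2^{1/2} \tilde G_{22}' \Xi_2^{-1}] + (p-m)\, \mathrm{tr}\, C_1, \tag{29} \]

\[ \lim_{\beta/\alpha \to 0} E[\mathrm{tr}(G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma')^2] \]
\[ = E[\mathrm{tr}(\tilde G_{11} D_1^{1/2} C_1 D_1^{1/2} \tilde G_{11}' \Xi_1^{-1})^2] + E[\mathrm{tr}(\tilde G_{22} D_2^{1/2} C_2 D_2^{1/2} \tilde G_{22}' \Xi_2^{-1})^2] \]
\[ \quad + 2(p-m) E[\mathrm{tr}\, C_1^2 D_1^{1/2} \tilde G_{11}' \Xi_1^{-1} \tilde G_{11} D_1^{1/2}] + 2\, \mathrm{tr}\, C_1\, E[\mathrm{tr}\, \Xi_2^{-1} \tilde G_{22} D_2^{1/2} C_2 D_2^{1/2} \tilde G_{22}'] \]
\[ \quad + (p-m)(p-m+2) \sum_{i=1}^{m} c_i^2 + 2(p-m) \sum_{1 \le i < s \le m} c_i c_s, \tag{30} \]

where the expectations on the right-hand side in (29) and (30) are taken with respect to the distributions in (22) and the decompositions in (23).

Now suppose that under the distribution of $\tilde W_{ss}$, $s = 1, 2$, in (22) and their spectral decomposition in (23), we estimate $\Xi_s$, $s = 1, 2$, by the following orthogonally equivariant estimators

\[ \hat\Xi_1 = \tilde G_{11} D_1^{1/2} C_1 D_1^{1/2} \tilde G_{11}', \qquad C_1 = \mathrm{diag}(c_1, \ldots, c_m), \]
\[ \hat\Xi_2 = \tilde G_{22} D_2^{1/2} C_2 D_2^{1/2} \tilde G_{22}', \qquad C_2 = \mathrm{diag}(c_{m+1}, \ldots, c_p); \]

then the risks with respect to each loss function (24), (25) are given by

\[ R_{11}(\hat\Xi_1, \Xi_1) = E[\mathrm{tr}(\hat\Xi_1 \Xi_1^{-1}) - \log|\hat\Xi_1 \Xi_1^{-1}| - m], \]
\[ R_{21}(\hat\Xi_2, \Xi_2) = E[\mathrm{tr}(\hat\Xi_2 \Xi_2^{-1}) - \log|\hat\Xi_2 \Xi_2^{-1}| - p + m], \]
\[ R_{12}(\hat\Xi_1, \Xi_1) = E[\mathrm{tr}(\hat\Xi_1 \Xi_1^{-1} - I_m)^2], \]
\[ R_{22}(\hat\Xi_2, \Xi_2) = E[\mathrm{tr}(\hat\Xi_2 \Xi_2^{-1} - I_{p-m})^2]. \]

The following theorem gives the decomposition of the asymptotic risk $\bar R_d(\hat\Sigma, \Sigma)$ into the risks $R_{1d}$, $R_{2d}$ and the residuals $R_{3d}$ for $d = 1, 2$.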
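Theorem 4 For $d = 1, 2$,

\[ \bar R_d(\hat\Sigma, \Sigma) = R_{1d}(\hat\Xi_1, \Xi_1) + R_{2d}(\hat\Xi_2, \Xi_2) + R_{3d}, \]

where

\[ R_{31} = (p-m) \sum_{i=1}^{m} c_i \]

and

\[ R_{32} = 2(p-m) E[\mathrm{tr}\, C_1^2 D_1^{1/2} \tilde G_{11}' \Xi_1^{-1} \tilde G_{11} D_1^{1/2}] + 2\, \mathrm{tr}\, C_1\, E[\mathrm{tr}\, \Xi_2^{-1} \tilde G_{22} D_2^{1/2} C_2 D_2^{1/2} \tilde G_{22}'] \]
\[ \quad + (p-m)(p-m+2) \sum_{i=1}^{m} c_i^2 + 2(p-m) \sum_{1 \le i < s \le m} c_i c_s - 2(p-m) \sum_{i=1}^{m} c_i. \]

All the expectations are taken with respect to the distributions (22) and the decompositions (23).

Proof. From (27),

\[ R_{11}(\hat\Xi_1, \Xi_1) = E[\mathrm{tr}\, \tilde G_{11} D_1^{1/2} C_1 D_1^{1/2} \tilde G_{11}' \Xi_1^{-1}] - \sum_{i=1}^{m} \log c_i - \sum_{i=1}^{m} E[\log \chi^2_{n-i+1}] - m, \]
\[ R_{21}(\hat\Xi_2, \Xi_2) = E[\mathrm{tr}\, \tilde G_{22} D_2^{1/2} C_2 D_2^{1/2} \tilde G_{22}' \Xi_2^{-1}] - \sum_{i=m+1}^{p} \log c_i - \sum_{i=m+1}^{p} E[\log \chi^2_{n-i+1}] - p + m. \]

Using (29) together with the above result, we have the result for $\bar R_1(\hat\Sigma, \Sigma)$. From (28),

\[ R_{12}(\hat\Xi_1, \Xi_1) = E[\mathrm{tr}(\tilde G_{11} D_1^{1/2} C_1 D_1^{1/2} \tilde G_{11}' \Xi_1^{-1})^2] - 2 E[\mathrm{tr}\, \tilde G_{11} D_1^{1/2} C_1 D_1^{1/2} \tilde G_{11}' \Xi_1^{-1}] + m, \]
\[ R_{22}(\hat\Xi_2, \Xi_2) = E[\mathrm{tr}(\tilde G_{22} D_2^{1/2} C_2 D_2^{1/2} \tilde G_{22}' \Xi_2^{-1})^2] - 2 E[\mathrm{tr}\, \tilde G_{22} D_2^{1/2} C_2 D_2^{1/2} \tilde G_{22}' \Xi_2^{-1}] + p - m. \]

Using (30) and (29) together with the above result, we have the result for $\bar R_2(\hat\Sigma, \Sigma)$. ∎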
3.3 Minimum Asymptotic Risk Estimator 

Consider the model (1) and suppose $\tau_1 = \cdots = \tau_m\ (= \tau)$ in (2). Then $\alpha = \tau + \sigma^2$ and $\beta = \sigma^2$ and

\[ \Xi_1 = I_m, \qquad \Xi_2 = I_{p-m}. \tag{31} \]

This assumption may not be very realistic. However, note that it is trivially satisfied in the one-factor model $m = 1$, which is frequently used in practice. In this subsection we focus on the estimation of $\Sigma$ under the condition (31). In this case, since there are no unknown parameters anymore, the asymptotic risk is uniquely determined; hence we can derive the "best", i.e., minimum asymptotic risk, estimator among the orthogonally equivariant estimators of the form (26). The following theorem gives the asymptotic risk for the case (31).

Theorem 5 If $\Xi_1 = I_m$, $\Xi_2 = I_{p-m}$, then the asymptotic risk $\bar R_d(\hat\Sigma, \Sigma)$, $d = 1, 2$, is given by the following function of $c = (c_1, \ldots, c_p)'$:

\[ \bar R_1(\hat\Sigma, \Sigma) = \sum_{i=1}^{p} (b_i c_i - \log c_i) - \sum_{i=1}^{p} E[\log \chi^2_{n-i+1}] - p, \tag{32} \]
\[ \bar R_2(\hat\Sigma, \Sigma) = c' A c - 2 b' c + p, \tag{33} \]

where $b = (b_1, \ldots, b_p)'$ is given by

\[ b_i = \begin{cases} E[d_i] + p - m, & \text{if } 1 \le i \le m, \\ E[d_i], & \text{if } m+1 \le i \le p, \end{cases} \]

and the $p \times p$ symmetric matrix $A = (a_{ij})$ is given by

\[ a_{ij} = \begin{cases} E[d_i^2 + 2(p-m) d_i] + (p-m)(p-m+2), & \text{if } 1 \le i = j \le m, \\ E[d_i^2], & \text{if } m+1 \le i = j \le p, \\ p - m, & \text{if } 1 \le i \ne j \le m, \\ E[d_j], & \text{if } 1 \le i \le m < j \le p, \\ E[d_i], & \text{if } 1 \le j \le m < i \le p, \\ 0, & \text{otherwise.} \end{cases} \]

All the expectations are taken with respect to the distributions (22) and the decompositions (23) with $\Xi_1 = I_m$, $\Xi_2 = I_{p-m}$.
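Proof. Evaluating $R_{jd}(\hat\Xi_j, \Xi_j)$, $1 \le j, d \le 2$, in Theorem 4 when $\Xi_1 = I_m$, $\Xi_2 = I_{p-m}$, we have the following results.

\[ R_{11}(\hat\Xi_1, \Xi_1) = E[L_1(\hat\Xi_1, I_m)] = E[\mathrm{tr}\, \hat\Xi_1 - \log|\hat\Xi_1| - m] \]
\[ = E\Big[\sum_{i=1}^{m} d_i c_i - \log|\tilde W_{11}|\Big] - \sum_{i=1}^{m} \log c_i - m = \sum_{i=1}^{m} E[d_i] c_i - E[\log|\tilde W_{11}|] - \sum_{i=1}^{m} \log c_i - m, \]

\[ R_{21}(\hat\Xi_2, \Xi_2) = \sum_{i=m+1}^{p} E[d_i] c_i - E[\log|\tilde W_{22}|] - \sum_{i=m+1}^{p} \log c_i - p + m, \]

\[ R_{12}(\hat\Xi_1, \Xi_1) = E[L_2(\hat\Xi_1, I_m)] = E[\mathrm{tr}(\hat\Xi_1 - I_m)^2] = E[\mathrm{tr}\, \hat\Xi_1^2 - 2\, \mathrm{tr}\, \hat\Xi_1] + m \]
\[ = E\Big[\sum_{i=1}^{m} d_i^2 c_i^2 - 2 \sum_{i=1}^{m} d_i c_i\Big] + m = \sum_{i=1}^{m} E[d_i^2] c_i^2 - 2 \sum_{i=1}^{m} E[d_i] c_i + m, \]

\[ R_{22}(\hat\Xi_2, \Xi_2) = \sum_{i=m+1}^{p} E[d_i^2] c_i^2 - 2 \sum_{i=m+1}^{p} E[d_i] c_i + p - m. \]

Next we calculate $R_{32}$ in Theorem 4 when $\Xi_1 = I_m$, $\Xi_2 = I_{p-m}$. Note that

\[ 2(p-m) E[\mathrm{tr}\, C_1^2 D_1^{1/2} \tilde G_{11}' \Xi_1^{-1} \tilde G_{11} D_1^{1/2}] = 2(p-m) E[\mathrm{tr}\, C_1^2 D_1] = 2(p-m) \sum_{i=1}^{m} E[d_i] c_i^2, \]
\[ 2\, \mathrm{tr}\, C_1\, E[\mathrm{tr}\, \Xi_2^{-1} \tilde G_{22} D_2^{1/2} C_2 D_2^{1/2} \tilde G_{22}'] = 2 \sum_{i=1}^{m} c_i \sum_{s=m+1}^{p} c_s E[d_s]. \]

Therefore

\[ R_{32} = \sum_{i=1}^{m} c_i^2 \{(p-m)(p-m+2) + 2(p-m) E[d_i]\} + 2(p-m) \sum_{1 \le i < s \le m} c_i c_s + 2 \sum_{1 \le i \le m < s \le p} c_i c_s E[d_s] - 2(p-m) \sum_{i=1}^{m} c_i. \]

Combining the above results, we see that (32) and (33) hold. ∎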

Corollary 1 The minimum asymptotic risk with respect to the loss function $L_1(\cdot, \cdot)$ is given by

\[ \sum_{i=1}^{p} \log b_i - \sum_{i=1}^{p} E[\log \chi^2_{n-i+1}]. \]

It is attained by $\hat\Sigma^{MA1}$ given by $c_i = b_i^{-1}$, $i = 1, \ldots, p$. The minimum asymptotic risk with respect to the loss function $L_2(\cdot, \cdot)$ is given by

\[ p - b' A^{-1} b. \]

It is attained by $\hat\Sigma^{MA2}$ given by $c = A^{-1} b$.

Proof. The results are easily obtained by the minimization of $\sum_{i=1}^{p} (b_i c_i - \log c_i)$ or $c' A c - 2 b' c$. ∎

The calculation of the asymptotic risks in Theorem 5 and of the $c_i$'s of $\hat\Sigma^{MA1}$ and $\hat\Sigma^{MA2}$ requires the evaluation of $E[d_i]$, $E[d_i^2]$, $i = 1, \ldots, p$, that is, the first and the second moments of the eigenvalues of the Wishart distribution with the identity covariance matrix. Generally we need to make use of Monte Carlo simulation or numerical integration for the evaluation of the moments of the eigenvalues. However, when $p$ is small and $n$ is appropriately even or odd depending on $p$, the analytic evaluation is feasible. See Section A.3 in the Appendix for this evaluation.
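The weights of the two minimum asymptotic risk estimators can be approximated numerically from Theorem 5 and Corollary 1. The sketch below replaces the analytic evaluation of the eigenvalue moments by plain Monte Carlo simulation, which is an assumption of this example rather than the method used for the tables; the function names and the number of replications are likewise illustrative.

```python
import numpy as np

def eigen_moments(df, k, reps, rng):
    """Monte Carlo estimates of E[d_i] and E[d_i^2] for the descending
    eigenvalues d_1 > ... > d_k of W_k(df, I_k)."""
    X = rng.normal(size=(reps, df, k))
    W = np.transpose(X, (0, 2, 1)) @ X            # batch of Wishart matrices X'X
    d = np.linalg.eigvalsh(W)[:, ::-1]            # eigenvalues, descending per draw
    return d.mean(axis=0), (d ** 2).mean(axis=0)

def ma_weights(n, p, m, reps=200_000, seed=0):
    """Approximate c for Sigma^{MA1} (c_i = 1/b_i) and Sigma^{MA2} (c = A^{-1} b)."""
    rng = np.random.default_rng(seed)
    e1a, e2a = eigen_moments(n, m, reps, rng)          # block 1: W_m(n, I_m)
    e1b, e2b = eigen_moments(n - m, p - m, reps, rng)  # block 2: W_{p-m}(n-m, I_{p-m})
    Ed, Ed2 = np.concatenate([e1a, e1b]), np.concatenate([e2a, e2b])
    q = p - m
    b = Ed.copy()
    b[:m] += q                                    # b_i = E[d_i] + p - m for i <= m
    A = np.zeros((p, p))
    A[:m, :m] = q                                 # a_ij = p - m, i != j in block 1
    A[:m, m:] = Ed[m:]                            # a_ij = E[d_j], i <= m < j
    A[m:, :m] = Ed[m:, None]                      # a_ij = E[d_i], j <= m < i
    idx = np.arange(p)
    A[idx, idx] = Ed2                             # diagonal: E[d_i^2] ...
    A[idx[:m], idx[:m]] += 2 * q * Ed[:m] + q * (q + 2)   # ... plus block-1 terms
    return 1.0 / b, np.linalg.solve(A, b)

c_ma1, c_ma2 = ma_weights(n=4, p=3, m=1)
```

For $p = 3$, $m = 1$, $n = 4$ the results should be close to the corresponding columns of Table 1, up to simulation error. For larger $p$ only the two calls to `eigen_moments` become more expensive; the construction of $A$ and $b$ is unchanged.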

Tables 1-5 give the $c_i$'s for $\hat\Sigma^{U}$, $\hat\Sigma^{SDS}$, $\hat\Sigma^{KG}$, $\hat\Sigma^{MA1}$, $\hat\Sigma^{MA2}$ when $p = 3, 4$ with several values of $n$. The values of the $c_i$'s for the minimum asymptotic risk estimators $\hat\Sigma^{MA1}$, $\hat\Sigma^{MA2}$ are calculated by the aforementioned analytic method. Note that for the case $p = 2$, the minimum asymptotic risk estimator naturally coincides with $\hat\Sigma^{SDS}$ ($\hat\Sigma^{KG}$), which is the minimum asymptotic risk estimator for $L_1$ ($L_2$) under the total dispersion of population eigenvalues (see Takemura and Sheena (2005)). As is well known, $n^{-1} l_i$ ($i = 1, \ldots, p$) tends to overestimate the corresponding eigenvalue of $\Sigma$ when $i$ is small, while it tends to underestimate the corresponding eigenvalue of $\Sigma$ when $i$ is large. The estimators $\hat\Sigma^{SDS}$, $\hat\Sigma^{KG}$ modify this tendency by using increasing weights $c_1 < \cdots < c_p$. It is seen from the tables that $\hat\Sigma^{MA1}$, $\hat\Sigma^{MA2}$ enlarge the weight difference within each block in most cases; for example, when $p = 4$, $m = 2$, the relation between the $c_i$'s of $\hat\Sigma^{SDS}$ ($\hat\Sigma^{KG}$) (say $c_i^{SDS}$ ($c_i^{KG}$), $i = 1, \ldots, 4$) and those of $\hat\Sigma^{MA1}$ ($\hat\Sigma^{MA2}$) (say $c_i^{MA1}$ ($c_i^{MA2}$), $i = 1, \ldots, 4$) is found as

\[ c_1^{MA1} < c_1^{SDS} < c_2^{SDS} < c_2^{MA1}, \qquad c_3^{MA1} < c_3^{SDS} < c_4^{SDS} < c_4^{MA1} \]

and

\[ c_1^{MA2} < c_1^{KG} < c_2^{KG} < c_2^{MA2}, \qquad c_3^{MA2} < c_3^{KG} < c_4^{KG} < c_4^{MA2}. \]

The tables also give the asymptotic risk comparison with respect to $L_1$ among the estimators $\hat\Sigma^{U}$, $\hat\Sigma^{SDS}$, $\hat\Sigma^{MA1}$ (see "Asy.Risk1") and that with respect to $L_2$ among the estimators $\hat\Sigma^{U}$, $\hat\Sigma^{KG}$, $\hat\Sigma^{MA2}$ (see "Asy.Risk2").
Table 1: p = 3, m = 1

n = 4        Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.2500   0.1667   0.1060   0.1667   0.1019
c2           0.2500   0.2500   0.1332   0.2000   0.1321
c3           0.2500   0.5000   0.1902   1.0000   0.4491
Asy.Risk1    2.1969   1.6592   -        1.4392   -
R.R.R.       -        24.47    -        34.49    -
Asy.Risk2    3.0000   -        1.4120   -        1.2792
R.R.R.       -        -        52.93    -        57.36

n = 6        Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.1667   0.1250   0.0855   0.1250   0.0828
c2           0.1667   0.1667   0.1030   0.1304   0.0977
c3           0.1667   0.2500   0.1352   0.4286   0.2675
Asy.Risk1    1.2387   0.9820   -        0.8270   -
R.R.R.       -        20.72    -        33.23    -
Asy.Risk2    2.0000   -        1.1056   -        0.9644
R.R.R.       -        -        44.72    -        51.78

n = 8        Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.1250   0.1000   0.0724   0.1000   0.0707
c2           0.1250   0.1250   0.0849   0.0980   0.0782
c3           0.1250   0.1667   0.1053   0.2632   0.1878
Asy.Risk1    0.8749   0.7187   -        0.5966   -
R.R.R.       -        17.85    -        31.81    -
Asy.Risk2    1.5000   -        0.9140   -        0.7812
R.R.R.       -        -        39.07    -        47.92

n = 10       Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.1000   0.0833   0.0630   0.0833   0.0619
c2           0.1000   0.1000   0.0723   0.0790   0.0655
c3           0.1000   0.1250   0.0865   0.1872   0.1438
Asy.Risk1    0.6765   0.5692   -        0.4676   -
R.R.R.       -        15.85    -        30.88    -
Asy.Risk2    1.2000   -        0.7817   -        0.6591
R.R.R.       -        -        34.86    -        45.07

n = 20       Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.0500   0.0455   0.0385   0.0455   0.0383
c2           0.0500   0.0500   0.0418   0.0410   0.0370
c3           0.0500   0.0556   0.0460   0.0735   0.0617
Asy.Risk1    0.3101   0.2819   -        0.2251   -
R.R.R.       -        10.89    -        28.84    -
Asy.Risk2    0.6000   -        0.4598   -        0.3745
R.R.R.       -        -        23.37    -        37.59

n = 50       Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.0200   0.0192   0.0179   0.0192   0.0178
c2           0.0200   0.0200   0.0185   0.0173   0.0166
c3           0.0200   0.0208   0.0193   0.0218   0.0236
Asy.Risk1    0.1230   0.1155   -        0.0901   -
R.R.R.       -        6.51     -        27.05    -
Asy.Risk2    0.2400   -        0.2093   -        0.1647
R.R.R.       -        -        12.79    -        31.39

Table 2: p = 3, m = 2

n = 5        Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.2000   0.1429   0.0944   0.1154   0.0891
c2           0.2000   0.2000   0.1158   0.3000   0.1791
c3           0.2000   0.3333   0.1580   0.3333   0.1464
Asy.Risk1    1.5769   1.3073   -        1.2107   -
R.R.R.       -        17.10    -        23.23    -
Asy.Risk2    2.4000   -        1.2543   -        1.1919
R.R.R.       -        -        47.74    -        50.34

n = 7        Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.1429   0.1111   0.0784   0.0893   0.0726
c2           0.1429   0.1429   0.0930   0.2083   0.1417
c3           0.1429   0.2000   0.1184   0.2000   0.1122
Asy.Risk1    1.0238   0.8688   -        0.7801   -
R.R.R.       -        15.14    -        23.81    -
Asy.Risk2    1.7143   -        1.0182   -        0.9455
R.R.R.       -        -        40.61    -        44.84

n = 9        Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.1111   0.0909   0.0674   0.0732   0.0616
c2           0.1111   0.1111   0.0781   0.1577   0.1162
c3           0.1111   0.1429   0.0950   0.1429   0.0914
Asy.Risk1    0.7635   0.6592   -        0.5793   -
R.R.R.       -        13.66    -        24.12    -
Asy.Risk2    1.3333   -        0.8585   -        0.7821
R.R.R.       -        -        35.61    -        41.34

n = 11       Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.0909   0.0769   0.0592   0.0623   0.0537
c2           0.0909   0.0909   0.0674   0.1260   0.0980
c3           0.0909   0.1111   0.0794   0.1111   0.0771
Asy.Risk1    0.6107   0.5342   -        0.4622   -
R.R.R.       -        12.52    -        24.31    -
Asy.Risk2    1.0909   -        0.7430   -        0.6663
R.R.R.       -        -        31.89    -        38.93

n = 21       Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.0476   0.0435   0.0371   0.0361   0.0331
c2           0.0476   0.0476   0.0401   0.0613   0.0537
c3           0.0476   0.0526   0.0439   0.0526   0.0435
Asy.Risk1    0.3040   0.2755   -        0.2281   -
R.R.R.       -        9.37     -        24.97    -
Asy.Risk2    0.5714   -        0.4473   -        0.3815
R.R.R.       -        -        21.71    -        33.24

n = 51       Σ^U      Σ^SDS    Σ^KG     Σ^MA1    Σ^MA2
c1           0.0196   0.0189   0.0175   0.0164   0.0158
c2           0.0196   0.0196   0.0182   0.0232   0.0220
c3           0.0196   0.0204   0.0189   0.0204   0.0189
Asy.Risk1    0.1191   0.1117   -        0.0881   -
R.R.R.       -        6.21     -        25.99    -
Asy.Risk2    0.2353   -        0.2066   -        0.1666
R.R.R.       -        -        12.20    -        29.18
Table 3: p = 4, m = 1

n = 5       Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.2000  0.1250  0.0822  0.1250  0.0759
c2          0.2000  0.1667  0.0973  0.1200  0.0927
c3          0.2000  0.2500  0.1222  0.3333  0.2310
c4          0.2000  0.5000  0.1746  1.5000  0.6931
Asy.Risk1   3.0752  2.0603  -       1.5303  -
R.R.R.      -       33.00   -       50.24   -
Asy.Risk2   4.0000  -       1.8435  -       1.4655
R.R.R.      -       -       53.91   -       63.36

n = 7       Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.1429  0.1000  0.0690  0.1000  0.0647
c2          0.1429  0.1250  0.0796  0.0883  0.0726
c3          0.1429  0.1667  0.0959  0.2000  0.1559
c4          0.1429  0.2500  0.1259  0.5956  0.3816
Asy.Risk1   1.8508  1.2955  -       0.9241  -
R.R.R.      -       30.01   -       50.07   -
Asy.Risk2   2.8571  -       1.4923  -       1.1116
R.R.R.      -       -       47.77   -       61.10

n = 9       Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.1111  0.0833  0.0600  0.0833  0.0571
c2          0.1111  0.1000  0.0681  0.0707  0.0602
c3          0.1111  0.1250  0.0798  0.1429  0.1179
c4          0.1111  0.1667  0.0990  0.3497  0.2553
Asy.Risk1   1.3436  0.9790  -       0.6852  -
R.R.R.      -       27.13   -       49.00   -
Asy.Risk2   2.2222  -       1.2591  -       0.9083
R.R.R.      -       -       43.34   -       59.13

n = 11      Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.0909  0.0714  0.0533  0.0714  0.0513
c2          0.0909  0.0833  0.0596  0.0593  0.0517
c3          0.0909  0.1000  0.0685  0.1111  0.0949
c4          0.0909  0.1250  0.0819  0.2413  0.1890
Asy.Risk1   1.0585  0.7956  -       0.5496  -
R.R.R.      -       24.84   -       48.08   -
Asy.Risk2   1.8182  -       1.0927  -       0.7730
R.R.R.      -       -       39.90   -       57.49



n = 21      Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.0476  0.0417  0.0346  0.0417  0.0344
c2          0.0476  0.0455  0.0372  0.0338  0.0311
c3          0.0476  0.0500  0.0404  0.0526  0.0483
c4          0.0476  0.0556  0.0444  0.0879  0.0782
Asy.Risk1   0.5127  0.4183  -       0.2769  -
R.R.R.      -       18.41   -       45.99   -
Asy.Risk2   0.9524  -       0.6708  -       0.4526
R.R.R.      -       -       29.57   -       52.47

n = 51      Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.0196  0.0185  0.0170  0.0185  0.0169
c2          0.0196  0.0192  0.0176  0.0154  0.0148
c3          0.0196  0.0200  0.0182  0.0204  0.0197
c4          0.0196  0.0208  0.0189  0.0278  0.0266
Asy.Risk1   0.2016  0.1777  -       0.1122  -
R.R.R.      -       11.88   -       44.36   -
Asy.Risk2   0.3922  -       0.3207  -       0.2055
R.R.R.      -       -       18.22   -       47.59



The risks are calculated analytically, except that $\sum_{i=1}^{p} E[\log \chi^2_{n-i+1}]$ is evaluated by Monte Carlo simulation using 10^ random numbers. "R.R.R." under "Asy.Risk1" or "Asy.Risk2" shows the risk reduction rate defined by

$$\text{R.R.R. of } \hat{\Sigma}^{*} = \frac{\text{The risk of } \hat{\Sigma}^{U} - \text{The risk of } \hat{\Sigma}^{*}}{\text{The risk of } \hat{\Sigma}^{U}} \times 100.$$
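As a quick check of how the R.R.R. columns relate to the risk columns, the following minimal sketch recomputes two entries from the tabulated asymptotic risks. The function name `risk_reduction_rate` is ours and is only illustrative; the baseline is taken to be the first estimator in each table, which is what the tabulated values imply.

```python
def risk_reduction_rate(risk_baseline: float, risk_estimator: float) -> float:
    """Percentage by which an estimator lowers the risk of the baseline."""
    if risk_baseline <= 0:
        raise ValueError("baseline risk must be positive")
    return (risk_baseline - risk_estimator) / risk_baseline * 100.0


# Asy.Risk2 entries for p = 3, m = 1, n = 20 (baseline, KG-type, MA2-type).
baseline, kg, ma2 = 0.6000, 0.4598, 0.3745
print(round(risk_reduction_rate(baseline, kg), 2))
print(round(risk_reduction_rate(baseline, ma2), 2))
```

Because the tabulated risks are themselves rounded to four digits, a recomputed rate may differ from the printed one in the last digit.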

It has been observed that $\hat{\Sigma}^{SDS}$ and $\hat{\Sigma}^{KG}$ drastically reduce the risk of $\hat{\Sigma}^{U}$ when the population eigenvalues are close to each other. Lin and Perlman (1985) report that when $\Sigma = I_p$, the R.R.R. of $\hat{\Sigma}^{SDS}$ often reaches 70%. See also Sugiura and Ishibayashi (1997) for a risk comparison by elaborate simulation. In the situation of the block-wise dispersion, the risk reduction rate of these estimators rarely approaches 50%. In particular, when $n$ is as large as 50, the rate is always under 20%. On the other hand, the risk reduction rates of $\hat{\Sigma}^{MA1}$ and $\hat{\Sigma}^{MA2}$ are constantly over 30% and often reach 50% irrespective of the value of $n$. It is interesting that $\hat{\Sigma}^{MA2}$ always outperforms $\hat{\Sigma}^{MA1}$ in terms of R.R.R.

3.4 Simulation studies 

In this subsection, we evaluate the performance of $\hat{\Sigma}^{MA1}$ and $\hat{\Sigma}^{MA2}$ by Monte Carlo simulation under the situation (31). As we saw in the previous subsection, in view of the asymptotic risks, $\hat{\Sigma}^{MA1}$ and $\hat{\Sigma}^{MA2}$ provide better risk reduction than $\hat{\Sigma}^{SDS}$ and $\hat{\Sigma}^{KG}$. From a practical point of view, however, it is important to see how widely the population eigenvalues must be dispersed so that
Table 4: p = 4, m = 2

n = 5       Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.2000  0.1250  0.0822  0.1034  0.0762
c2          0.2000  0.1667  0.0973  0.2308  0.1261
c3          0.2000  0.2500  0.1222  0.2000  0.1173
c4          0.2000  0.5000  0.1746  1.0000  0.3988
Asy.Risk1   3.0752  2.2687  -       1.9819  -
R.R.R.      -       26.23   -       35.55   -
Asy.Risk2   4.0000  -       1.8668  -       1.7317
R.R.R.      -       -       53.33   -       56.71

n = 7       Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.1429  0.1000  0.0690  0.0820  0.0632
c2          0.1429  0.1250  0.0796  0.1724  0.1055
c3          0.1429  0.1667  0.0959  0.1301  0.0885
c4          0.1429  0.2500  0.1259  0.4286  0.2425
Asy.Risk1   1.8508  1.4334  -       1.2107  -
R.R.R.      -       22.55   -       34.59   -
Asy.Risk2   2.8571  -       1.5273  -       1.3728
R.R.R.      -       -       46.54   -       51.95

n = 9       Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.1111  0.0833  0.0600  0.0682  0.0546
c2          0.1111  0.1000  0.0681  0.1362  0.0910
c3          0.1111  0.1250  0.0798  0.0980  0.0719
c4          0.1111  0.1667  0.0990  0.2632  0.1727
Asy.Risk1   1.3436  1.0774  -       0.8908  -
R.R.R.      -       19.81   -       33.70   -
Asy.Risk2   2.2222  -       1.2992  -       1.1422
R.R.R.      -       -       41.54   -       48.60

n = 11      Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.0909  0.0714  0.0533  0.0586  0.0482
c2          0.0909  0.0833  0.0596  0.1119  0.0798
c3          0.0909  0.1000  0.0685  0.0790  0.0609
c4          0.0909  0.1250  0.0819  0.1872  0.1337
Asy.Risk1   1.0585  0.8700  -       0.7080  -
R.R.R.      -       17.81   -       33.11   -
Asy.Risk2   1.8182  -       1.1337  -       0.9792
R.R.R.      -       -       37.65   -       46.14

n = 21      Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.0476  0.0417  0.0346  0.0349  0.0330
c2          0.0476  0.0455  0.0372  0.0577  0.0531
c3          0.0476  0.0500  0.0404  0.0410  0.0352
c4          0.0476  0.0556  0.0444  0.0735  0.0615
Asy.Risk1   0.5127  0.4477  -       0.3477  -
R.R.R.      -       12.68   -       32.18   -
Asy.Risk2   0.9524  -       0.7013  -       0.5722
R.R.R.      -       -       26.36   -       39.92

n = 51      Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.0196  0.0185  0.0170  0.0162  0.0153
c2          0.0196  0.0192  0.0176  0.0227  0.0211
c3          0.0196  0.0200  0.0182  0.0173  0.0163
c4          0.0196  0.0208  0.0189  0.0248  0.0232
Asy.Risk1   0.2016  0.1857  -       0.1377  -
R.R.R.      -       7.90    -       31.73   -
Asy.Risk2   0.3922  -       0.3331  -       0.2544
R.R.R.      -       -       15.06   -       35.13



Table 5: p = 4, m = 3

n = 4       Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.2500  0.1429  0.0919  0.1071  0.0852
c2          0.2500  0.2000  0.1111  0.2500  0.1670
c3          0.2500  0.3333  0.1449  0.6000  0.2383
c4          0.2500  1.0000  0.2174  1.0000  0.1698
Asy.Risk1   4.8592  3.6569  -       3.4447  -
R.R.R.      -       24.74   -       29.11   -
Asy.Risk2   5.0000  -       2.0872  -       1.9697
R.R.R.      -       -       58.26   -       60.61

n = 6       Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.1667  0.1111  0.0749  0.0812  0.0678
c2          0.1667  0.1429  0.0873  0.1667  0.1248
c3          0.1667  0.2000  0.1072  0.3733  0.2028
c4          0.1667  0.3333  0.1461  0.3333  0.1209
Asy.Risk1   2.2985  1.7446  -       1.5186  -
R.R.R.      -       24.10   -       33.93   -
Asy.Risk2   3.3333  -       1.6702  -       1.5097
R.R.R.      -       -       49.89   -       54.71

n = 8       Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.1250  0.0909  0.0642  0.0660  0.0569
c2          0.1250  0.1111  0.0733  0.1250  0.0999
c3          0.1250  0.1429  0.0870  0.2591  0.1670
c4          0.1250  0.2000  0.1108  0.2000  0.0966
Asy.Risk1   1.5538  1.2032  -       0.9929  -
R.R.R.      -       22.57   -       36.10   -
Asy.Risk2   2.5000  -       1.3948  -       1.2111
R.R.R.      -       -       44.21   -       51.56

n = 10      Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.1000  0.0769  0.0565  0.0560  0.0493
c2          0.1000  0.0909  0.0636  0.1000  0.0833
c3          0.1000  0.1111  0.0737  0.1944  0.1385
c4          0.1000  0.1429  0.0896  0.1429  0.0810
Asy.Risk1   1.1828  0.9327  -       0.7412  -
R.R.R.      -       21.15   -       37.34   -
Asy.Risk2   2.0000  -       1.1991  -       1.0067
R.R.R.      -       -       40.05   -       49.66

n = 20      Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.0500  0.0435  0.0358  0.0326  0.0303
c2          0.0500  0.0476  0.0386  0.0500  0.0455
c3          0.0500  0.0526  0.0421  0.0808  0.0694
c4          0.0500  0.0588  0.0465  0.0588  0.0450
Asy.Risk1   0.5385  0.4484  -       0.3218  -
R.R.R.      -       16.73   -       40.24   -
Asy.Risk2   1.0000  -       0.7122  -       0.5395
R.R.R.      -       -       28.78   -       46.05

n = 50      Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
c1          0.0200  0.0189  0.0172  0.0151  0.0146
c2          0.0200  0.0196  0.0179  0.0200  0.0192
c3          0.0200  0.0204  0.0186  0.0271  0.0256
c4          0.0200  0.0213  0.0193  0.0213  0.0192
Asy.Risk1   0.2069  0.1836  -       0.1207  -
R.R.R.      -       11.28   -       41.65   -
Asy.Risk2   0.4000  -       0.3293  -       0.2236
R.R.R.      -       -       17.68   -       44.11
the use of $\hat{\Sigma}^{MAd}$, $d = 1, 2$, is recommended. The convergence speed of the distributions given in Theorem 2, which is an interesting topic in itself, is closely related to this problem.

To see the convergence speed of both the distributions and the risks, we carried out Monte Carlo simulations for the two cases $p = 3$, $m = 1$ and $p = 4$, $m = 1$. In each case, we took 11 values, 1.0, 0.8, 0.6, 0.4, 0.2, $10^{-i}$ ($i = 1, \ldots, 6$), for the convergence parameter $\beta$, while $\alpha$ is fixed at 1. We took three different values of $n$ in each case and generated 10^ random Wishart matrices under the given $p$, $n$, $\beta$. The results are given in Table 6 ($p = 3$, $m = 1$) and Table 7 ($p = 4$, $m = 1$). The upper part of each table shows the speed of the distributional convergence in Theorem 2. Note that when $\Sigma_1 = I_m$, $\Sigma_2 = I_{p-m}$, the asymptotic distribution of a diagonal element of $W_{ss}$, $s = 1, 2$, is a $\chi^2$ distribution. The labels in the tables are defined as follows, with $\chi^2_n(\alpha)$ and $z(\alpha)$ denoting the lower $\alpha$ percentage points of the $\chi^2$ distribution with $n$ degrees of freedom and of the standard normal distribution, respectively:

Table 6

Prob 1a = $P(W_{11} < \chi^2_{n}(0.05))$, Prob 1b = $P(W_{11} < \chi^2_{n}(0.95))$,

Prob 2a = $P((W_{22})_{11} < \chi^2_{n-1}(0.05))$, Prob 2b = $P((W_{22})_{11} < \chi^2_{n-1}(0.95))$,

Prob 3a = $P((W_{22})_{22} < \chi^2_{n-1}(0.05))$, Prob 3b = $P((W_{22})_{22} < \chi^2_{n-1}(0.95))$,

Prob 4a = $P((Z_{21})_{11} < z(0.05))$, Prob 4b = $P((Z_{21})_{11} < z(0.95))$,

Prob 5a = $P((Z_{21})_{21} < z(0.05))$, Prob 5b = $P((Z_{21})_{21} < z(0.95))$,

Table 7

Prob 1a = $P(W_{11} < \chi^2_{n}(0.05))$, Prob 1b = $P(W_{11} < \chi^2_{n}(0.95))$,

Prob 2a = $P((W_{22})_{11} < \chi^2_{n-1}(0.05))$, Prob 2b = $P((W_{22})_{11} < \chi^2_{n-1}(0.95))$,

Prob 3a = $P((W_{22})_{33} < \chi^2_{n-1}(0.05))$, Prob 3b = $P((W_{22})_{33} < \chi^2_{n-1}(0.95))$,

Prob 4a = $P((Z_{21})_{11} < z(0.05))$, Prob 4b = $P((Z_{21})_{11} < z(0.95))$,

Prob 5a = $P((Z_{21})_{31} < z(0.05))$, Prob 5b = $P((Z_{21})_{31} < z(0.95))$.

In the lower part of each table, "Risk 1_*" and "Risk 2_*" show the risks of the corresponding estimator $\hat{\Sigma}^{*}$ with respect to $L_1$ and $L_2$, respectively. The tables show that

1. The convergence of the diagonal elements of $W_{ss}$, $s = 1, 2$, is so rapid that when $\beta = 0.1$ the asymptotic distribution already gives a good approximation to the exact distribution. When $\beta = 0.1$, every probability for the diagonal elements is within 0.01 of the asymptotic probability.

2. The convergence of $Z$ is quite slow compared with that of the diagonal elements of $W_{ss}$, $s = 1, 2$. For an approximation as good as the above, $\beta$ must be as small as $10^{-5}$ or $10^{-6}$.

3. The risks also converge rapidly to the asymptotic risks, so that $\beta = 0.1$ is small enough to give a good approximation. In fact, all the risks in the tables at $\beta = 0.1$ are within the ±5% interval centered at the asymptotic risk.

4. The risk of $\hat{\Sigma}^{MAd}$, $d = 1, 2$, is always lower than that of the competing estimators. Most notably, their superiority in risk is maintained even when the population eigenvalues are all equal. It seems that $\hat{\Sigma}^{MAd}$, $d = 1, 2$, is robust to deviation of the population eigenvalues from the block-wise dispersion.
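The "Risk 1_U" and "Risk 2_U" rows can be reproduced roughly with a few lines of simulation. The sketch below is not the simulation code behind the tables. It assumes that $L_1$ is the entropy loss $\mathrm{tr}(\hat{\Sigma}\Sigma^{-1}) - \log|\hat{\Sigma}\Sigma^{-1}| - p$, that $L_2$ is the quadratic loss $\mathrm{tr}(\hat{\Sigma}\Sigma^{-1} - I_p)^2$, and that $\hat{\Sigma}^{U} = S/n$; these assumptions are consistent with the tabulated values (for example, the risk of $S/n$ under this quadratic loss is $p(p+1)/n$) but are not restated in this section. Under them the risk of $S/n$ does not depend on $\Sigma$, so the sketch takes $\Sigma = I_p$. All function names are illustrative.

```python
import math
import random


def wishart_identity(n, p, rng):
    """Return W = X'X for an n x p matrix X of independent N(0, 1) entries."""
    rows = [[rng.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
    return [[sum(r[a] * r[b] for r in rows) for b in range(p)] for a in range(p)]


def log_det_spd(a):
    """Log-determinant of a symmetric positive definite matrix via Cholesky."""
    p = len(a)
    chol = [[0.0] * p for _ in range(p)]
    for i in range(p):
        for j in range(i + 1):
            s = a[i][j] - sum(chol[i][k] * chol[j][k] for k in range(j))
            chol[i][j] = math.sqrt(s) if i == j else s / chol[j][j]
    return 2.0 * sum(math.log(chol[i][i]) for i in range(p))


def risks_of_unbiased(n, p, reps, seed=0):
    """Monte Carlo risks of S / n under the two assumed losses, Sigma = I."""
    rng = random.Random(seed)
    total1 = total2 = 0.0
    for _ in range(reps):
        w = wishart_identity(n, p, rng)
        est = [[w[a][b] / n for b in range(p)] for a in range(p)]
        trace = sum(est[a][a] for a in range(p))
        # Entropy loss: tr(E) - log det(E) - p.
        total1 += trace - log_det_spd(est) - p
        # Quadratic loss: tr((E - I)^2), the sum of squared entries of E - I.
        total2 += sum((est[a][b] - (a == b)) ** 2
                      for a in range(p) for b in range(p))
    return total1 / reps, total2 / reps


risk1, risk2 = risks_of_unbiased(n=10, p=3, reps=20000)
print(risk1, risk2, 3 * 4 / 10)
```

With $p = 3$ and $n = 10$ the two estimates should land near the "Asymp." entries of the "Risk 1_U" and "Risk 2_U" rows for $n = 10$ in Table 6, up to simulation error.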






Table 6: p = 3, m = 1

n = 10
β           1       0.8     0.6     0.4     0.2     10^-1   10^-2   10^-3   10^-4   10^-5   10^-6   Asymp.
Prob 1a     0.4994  0.3992  0.2814  0.1551  0.0695  0.0534  0.0501  0.0491  0.0507  0.0491  0.0508  0.0500
Prob 2a     0.4091  0.3273  0.2321  0.1321  0.0677  0.0558  0.0516  0.0489  0.0504  0.0495  0.0502  0.0500
Prob 3a     0.4121  0.3302  0.2311  0.1317  0.0684  0.0564  0.0503  0.0499  0.0505  0.0499  0.0518  0.0500
Prob 4a     0.2024  0.1799  0.1502  0.1072  0.0597  0.0385  0.0263  0.0255  0.0294  0.0429  0.0499  0.0500
Prob 5a     0.0001  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0000  0.0269  0.0500  0.0500
Prob 1b     0.9700  0.9636  0.9572  0.9531  0.9503  0.9507  0.9499  0.9514  0.9496  0.9497  0.9496  0.9500
Prob 2b     0.9993  0.9985  0.9955  0.9871  0.9695  0.9576  0.9508  0.9498  0.9503  0.9488  0.9498  0.9500
Prob 3b     0.9994  0.9983  0.9957  0.9874  0.9693  0.9582  0.9508  0.9509  0.9496  0.9500  0.9497  0.9500
Prob 4b     0.6174  0.6492  0.6986  0.7671  0.8528  0.8924  0.9236  0.9255  0.9301  0.9451  0.9515  0.9500
Prob 5b     0.4137  0.4673  0.5465  0.6624  0.7937  0.8530  0.8971  0.8994  0.9001  0.9275  0.9504  0.9500
Risk 1_U    0.6769  0.6753  0.6786  0.6779  0.6777  0.6778  0.6784  0.6757  0.6759  0.6758  0.6800  0.6765
Risk 1_SDS  0.4589  0.4611  0.4770  0.5038  0.5409  0.5580  0.5701  0.5687  0.5690  0.5684  0.5727  0.5692
Risk 1_MA1  0.3595  0.3644  0.3824  0.4091  0.4400  0.4553  0.4677  0.4668  0.4677  0.4660  0.4704  0.4676
Risk 2_U    1.1996  1.1997  1.2017  1.1976  1.1983  1.1989  1.1980  1.1966  1.2020  1.1990  1.2021  1.2000
Risk 2_KG   0.7117  0.7132  0.7228  0.7407  0.7641  0.7748  0.7815  0.7812  0.7806  0.7811  0.7839  0.7817
Risk 2_MA2  0.6109  0.6147  0.6255  0.6397  0.6540  0.6625  0.6706  0.6703  0.6704  0.6689  0.6725  0.6591



n = 20
β           1       0.8     0.6     0.4     0.2     10^-1   10^-2   10^-3   10^-4   10^-5   10^-6   Asymp.
Prob 1a     0.6164  0.4550  0.2706  0.1187  0.0574  0.0523  0.0495  0.0506  0.0517  0.0491  0.0498  0.0500
Prob 2a     0.511?  0.3837  0.2285  0.1027  0.0605  0.0537  0.0511  0.0498  0.0505  0.0490  0.0491  0.0500
Prob 3a     0.5081  0.3856  0.2309  0.1043  0.0594  0.0528  0.0511  0.0508  0.0493  0.0513  0.0499  0.0500
Prob 4a     0.2493  0.2196  0.1684  0.1100  0.0560  0.0377  0.0257  0.0264  0.0328  0.0483  0.0513  0.0500
Prob 5a     0.0015  0.0015  0.0008  0.0002  0.0000  0.0000  0.0000  0.0000  0.0003  0.0451  0.0497  0.0500
Prob 1b     0.9767  0.9661  0.9595  0.9543  0.9515  0.9484  0.9513  0.9500  0.9498  0.9498  0.9494  0.9500
Prob 2b     0.9995  0.9984  0.9936  0.9816  0.9631  0.9547  0.9513  0.9499  0.9498  0.9511  0.9500  0.9500
Prob 3b     0.9994  0.9983  0.9940  0.9815  0.9623  0.9552  0.9520  0.9516  0.9499  0.9501  0.9505  0.9500
Prob 4b     0.5542  0.6008  0.6689  0.7653  0.8592  0.8975  0.9200  0.9257  0.9334  0.9499  0.9508  0.9500
Prob 5b     0.3123  0.3793  0.4993  0.6566  0.8026  0.8574  0.8953  0.8998  0.8987  0.9457  0.9503  0.9500
Risk 1_U    0.3178  0.3179  0.3181  0.3171  0.3183  0.3178  0.3177  0.3175  0.3177  0.3171  0.3176  0.3164
Risk 1_SDS  0.2363  0.2390  0.2486  0.2628  0.2767  0.2802  0.2829  0.2830  0.2833  0.2827  0.2832  0.2819
Risk 1_MA1  0.1880  0.1923  0.2023  0.2115  0.2200  0.2236  0.2261  0.2262  0.2267  0.2260  0.2262  0.2251
Risk 2_U    0.5995  0.6011  0.6008  0.5992  0.5999  0.6006  0.5992  0.5987  0.6005  0.6003  0.6011  0.6000
Risk 2_KG   0.4085  0.4117  0.4226  0.4384  0.4530  0.4568  0.4595  0.4598  0.4600  0.4593  0.4594  0.4598
Risk 2_MA2  0.3563  0.3606  0.3674  0.3706  0.3744  0.3775  0.3792  0.3794  0.3801  0.3797  0.3793  0.3745



n = 50
β           1       0.8     0.6     0.4     0.2     10^-1   10^-2   10^-3   10^-4   10^-5   10^-6   Asymp.
Prob 1a     0.7358  0.4769  0.2076  0.0793  0.0532  0.0503  0.0484  0.0506  0.0506  0.0485  0.0501  0.0500
Prob 2a     0.6110  0.4089  0.1725  0.0720  0.0549  0.0511  0.0512  0.0498  0.0505  0.0489  0.0499  0.0500
Prob 3a     0.6079  0.4100  0.1737  0.0732  0.0564  0.0513  0.0495  0.0484  0.0487  0.0489  0.0504  0.0500
Prob 4a     0.2992  0.2529  0.1788  0.1042  0.0541  0.0368  0.0271  0.0274  0.0411  0.0493  0.0500  0.0500
Prob 5a     0.0200  0.0109  0.0031  0.0002  0.0000  0.0000  0.0000  0.0000  0.0072  0.0490  0.0506  0.0500
Prob 1b     0.9823  0.9707  0.9606  0.9532  0.9511  0.9499  0.9485  0.9503  0.9489  0.9500  0.9505  0.9500
Prob 2b     0.9995  0.9979  0.9883  0.9696  0.9567  0.9519  0.9503  0.9511  0.9498  0.9506  0.9498  0.9500
Prob 3b     0.9996  0.9977  0.9889  0.9705  0.9557  0.9528  0.9495  0.9499  0.9500  0.9506  0.9498  0.9500
Prob 4b     0.5050  0.5599  0.6582  0.7701  0.8610  0.8967  0.9231  0.9266  0.9408  0.9502  0.9495  0.9500
Prob 5b     0.2273  0.3156  0.4805  0.6648  0.8100  0.8613  0.8960  0.8993  0.9066  0.9494  0.9498  0.9500
Risk 1_U    0.1223  0.1226  0.1227  0.1229  0.1228  0.1228  0.1226  0.1223  0.1228  0.1221  0.1230  0.1236
Risk 1_SDS  0.1006  0.1026  0.1074  0.1117  0.1137  0.1143  0.1145  0.1143  0.1148  0.1140  0.1149  0.1155
Risk 1_MA1  0.0814  0.0843  0.0867  0.0871  0.0882  0.0888  0.0891  0.0891  0.0896  0.0887  0.0895  0.0901
Risk 2_U    0.2391  0.2399  0.2400  0.2401  0.2402  0.2404  0.2403  0.2391  0.2405  0.2389  0.2406  0.2400
Risk 2_KG   0.1874  0.1906  0.1982  0.2049  0.2079  0.2087  0.2091  0.2086  0.2096  0.2083  0.2097  0.2093
Risk 2_MA2  0.1641  0.1673  0.1669  0.1643  0.1649  0.1654  0.1658  0.1656  0.1665  0.1650  0.1663  0.1647



Because of this robustness, $\hat{\Sigma}^{MAd}$, $d = 1, 2$, seem to be useful for various applications. Now, as the last topic in this section, apart from the decision-theoretic approach, we evaluate the performance of these new estimators in discriminant analysis. We use the well-known example of Fisher's iris data. The data consist of 50 samples from each of the three groups (species) with a 4-dimensional variable ($x_1$: sepal length (cm), $x_2$: sepal width (cm), $x_3$: petal length (cm), $x_4$: petal width (cm)). We
Table 7: p = 4, m = 1

n = 11
β           1       0.8     0.6     0.4     0.2     10^-1   10^-2   10^-3   10^-4   10^-5   10^-6   Asymp.
Prob 1a     0.5852  0.4760  0.3396  0.1881  0.0751  0.0554  0.0517  0.0497  0.0495  0.0498  0.0489  0.0500
Prob 2a     0.3156  0.2620  0.1937  0.1166  0.0660  0.0557  0.0505  0.0494  0.0510  0.0501  0.0492  0.0500
Prob 3a     0.3146  0.2620  0.1920  0.1172  0.0651  0.0560  0.0515  0.0503  0.0496  0.0494  0.0503  0.0500
Prob 4a     0.1955  0.1794  0.1533  0.1124  0.0650  0.0431  0.0283  0.0284  0.0315  0.0449  0.0507  0.0500
Prob 5a     0.1312  0.1206  0.1005  0.0685  0.0342  0.0205  0.0131  0.0131  0.0172  0.0391  0.0497  0.0500
Prob 1b     0.9761  0.9676  0.9610  0.9547  0.9521  0.9510  0.9500  0.9510  0.9495  0.9508  0.9501  0.9500
Prob 2b     0.9986  0.9977  0.9948  0.9863  0.9690  0.9581  0.9505  0.9510  0.9507  0.9509  0.9493  0.9500
Prob 3b     0.9985  0.9979  0.9947  0.9861  0.9681  0.9570  0.9509  0.9515  0.9499  0.9508  0.9508  0.9500
Prob 4b     0.6525  0.6774  0.7152  0.7789  0.8576  0.8979  0.9251  0.9276  0.9310  0.9454  0.9501  0.9500
Prob 5b     0.5899  0.6184  0.6607  0.7327  0.8304  0.8751  0.9091  0.9115  0.9185  0.9399  0.9508  0.9500
Risk 1_U    1.0566  1.0583  1.0552  1.0592  1.0577  1.0583  1.0603  1.0544  1.0574  1.0573  1.0559  1.0585
Risk 1_SDS  0.6514  0.6572  0.6714  0.7092  0.7558  0.7781  0.7954  0.7920  0.7943  0.7942  0.7927  0.7956
Risk 1_MA1  0.4064  0.4154  0.4367  0.4738  0.5104  0.5295  0.5485  0.5471  0.5484  0.5478  0.5468  0.5496
Risk 2_U    1.8199  1.8213  1.8147  1.8170  1.8175  1.8199  1.8210  1.8147  1.8206  1.8180  1.8176  1.8182
Risk 2_KG   1.0173  1.0214  1.0291  1.0493  1.0749  1.0876  1.0939  1.0921  1.0929  1.0926  1.0915  1.0927
Risk 2_MA2  0.5967  0.6075  0.6326  0.6767  0.7268  0.7516  0.7728  0.7724  0.7737  0.7733  0.7719  0.7730



n = 21
β           1       0.8     0.6     0.4     0.2     10^-1   10^-2   10^-3   10^-4   10^-5   10^-6   Asymp.
Prob 1a     0.7030  0.5119  0.3317  0.1128  0.0601  0.0532  0.0503  0.0505  0.0498  0.0495  0.0509  0.0500
Prob 2a     0.4013  0.3183  0.2017  0.0985  0.057?  0.0517  0.0508  0.0495  0.0492  0.0505  0.0505  0.0500
Prob 3a     0.3995  0.3222  0.2019  0.0975  0.0581  0.0527  0.0507  0.0504  0.0496  0.0497  0.0493  0.0500
Prob 4a     0.2413  0.2156  0.1748  0.1172  0.0601  0.0421  0.0297  0.0292  0.0344  0.0480  0.0503  0.0500
Prob 5a     0.1720  0.1533  0.1185  0.0711  0.0331  0.0201  0.0141  0.0137  0.0211  0.0484  0.0505  0.0500
Prob 1b     0.9830  0.9737  0.9627  0.9557  0.9502  0.9506  0.9503  0.9504  0.9497  0.9505  0.9505  0.9500
Prob 2b     0.9989  0.9977  0.9929  0.9809  0.9617  0.9548  0.9507  0.9501  0.9488  0.9500  0.9514  0.9500
Prob 3b     0.9988  0.9975  0.9935  0.9805  0.9628  0.9549  0.9509  0.9502  0.9501  0.9496  0.9487  0.9500
Prob 4b     0.5985  0.6278  0.6881  0.7757  0.8657  0.8998  0.9262  0.9272  0.9339  0.9476  0.9494  0.9500
Prob 5b     0.5303  0.5632  0.6291  0.7282  0.8368  0.8794  0.9118  0.9130  0.9219  0.9479  0.9504  0.9500
Risk 1_U    0.5121  0.5136  0.5135  0.5116  0.5115  0.5128  0.5110  0.5127  0.5115  0.5109  0.5119  0.5127
Risk 1_SDS  0.3503  0.3552  0.3677  0.3871  0.4056  0.4134  0.4167  0.4183  0.4172  0.4169  0.4177  0.4183
Risk 1_MA1  0.2241  0.2315  0.2461  0.2568  0.2650  0.2715  0.2759  0.2772  0.2759  0.2764  0.2765  0.2769
Risk 2_U    0.9512  0.9514  0.9537  0.9503  0.9516  0.9521  0.9477  0.9535  0.9520  0.9501  0.9505  0.9524
Risk 2_KG   0.6059  0.6109  0.6233  0.6429  0.6607  0.6669  0.6692  0.6708  0.6700  0.6695  0.6707  0.6708
Risk 2_MA2  0.3510  0.3622  0.3861  0.4114  0.4326  0.4433  0.4516  0.4532  0.4521  0.4524  0.4525  0.4526



n = 51
β           1       0.8     0.6     0.4     0.2     10^-1   10^-2   10^-3   10^-4   10^-5   10^-6   Asymp.
Prob 1a     0.8209  0.5805  0.2691  0.0916  0.0533  0.0492  0.0498  0.0504  0.0501  0.0504  0.0500  0.0500
Prob 2a     0.5101  0.3626  0.1647  0.0721  0.0560  0.0522  0.0501  0.0502  0.0501  0.0504  0.0500  0.0500
Prob 3a     0.5098  0.3610  0.1669  0.0722  0.0555  0.0533  0.0500  0.0507  0.0479  0.0498  0.0506  0.0500
Prob 4a     0.2912  0.2595  0.1878  0.1118  0.0604  0.0415  0.0303  0.0291  0.0403  0.0507  0.0491  0.0500
Prob 5a     0.2191  0.1863  0.1307  0.0689  0.0308  0.0196  0.0133  0.0148  0.0313  0.0501  0.0499  0.0500
Prob 1b     0.9891  0.9762  0.9649  0.9548  0.9507  0.9501  0.9504  0.9501  0.9504  0.9505  0.9497  0.9500
Prob 2b     0.9992  0.9970  0.9889  0.9700  0.9573  0.9526  0.9502  0.9498  0.9503  0.9507  0.9504  0.9500
Prob 3b     0.9990  0.9973  0.9891  0.9712  0.9565  0.9521  0.9499  0.9513  0.9503  0.9506  0.9492  0.9500
Prob 4b     0.5383  0.5836  0.6703  0.7803  0.8683  0.9022  0.9272  0.9286  0.9411  0.9494  0.9503  0.9500
Prob 5b     0.4666  0.5129  0.6101  0.7334  0.8386  0.8789  0.9081  0.9153  0.9312  0.9496  0.9503  0.9500
Risk 1_U    0.2018  0.2022  0.2019  0.2017  0.2019  0.2017  0.2020  0.2017  0.2020  0.2023  0.2018  0.2016
Risk 1_SDS  0.1566  0.1592  0.1658  0.1721  0.1758  0.1768  0.1780  0.1777  0.1780  0.1783  0.1779  0.1777
Risk 1_MA1  0.1037  0.1083  0.1109  0.1088  0.1104  0.1113  0.1125  0.1124  0.1123  0.1124  0.1125  0.1122
Risk 2_U    0.3923  0.3939  0.3920  0.3916  0.3920  0.3924  0.3927  0.3919  0.3931  0.3929  0.3920  0.3922
Risk 2_KG   0.2896  0.2938  0.3038  0.3127  0.3179  0.3194  0.3208  0.3203  0.3211  0.3215  0.3208  0.3207
Risk 2_MA2  0.1785  0.1867  0.1943  0.1959  0.2010  0.2033  0.2057  0.2055  0.2054  0.2056  0.2059  0.2055



downloaded the data from the website http://www-unix.oit.umass.edu/~statdata. We let $x_j^{(i)}$, $i = 1, 2, 3$, $j = 1, \ldots, 50$, denote the $j$th sample in the $i$th group. The estimators to be tested are the traditional estimators $\hat{\Sigma}^{U}$, $\hat{\Sigma}^{SDS}$, $\hat{\Sigma}^{KG}$ and the new estimators $\hat{\Sigma}^{MA1}$, $\hat{\Sigma}^{MA2}$, which are formulated under the condition $p = 4$, $m = 1$.

We carry out cross validations. Suppose a learning data set $y_j^{(i)}$, $j = 1, \ldots, N$, is chosen from
Table 8: 10-sample-set

Learning
Data Set    Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
1           82.50   83.33   83.33   81.67   82.50
2           85.83   85.00   85.00   85.00   85.00
3           82.50   82.50   82.50   82.50   82.50
4           81.67   83.33   82.50   85.83   84.17
5           76.67   77.50   77.50   79.17   79.17
Average     81.83   82.33   82.17   82.83   82.67



Table 9: 5-sample-set

Learning
Data Set    Σ^U     Σ^SDS   Σ^KG    Σ^MA1   Σ^MA2
1           66.67   71.85   68.89   75.56   75.56
2           78.52   80.00   78.52   85.19   82.96
3           41.48   41.48   41.48   44.44   42.96
4           43.70   46.67   45.93   53.33   50.37
5           88.89   88.15   88.89   92.59   90.37
6           73.33   78.52   77.78   89.63   88.15
7           64.44   68.89   67.41   73.33   71.85
8           73.33   75.56   72.59   82.96   79.26
9           73.33   75.56   72.59   82.96   79.26
10          69.63   72.59   71.85   82.22   77.78
Average     67.33   69.93   68.59   76.22   73.85



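the $i$th group, $i = 1, 2, 3$. Estimates of the population covariance matrix of the $i$th group are calculated from $\hat{\Sigma}^{U}$, $\hat{\Sigma}^{SDS}$, $\hat{\Sigma}^{KG}$, $\hat{\Sigma}^{MA1}$, $\hat{\Sigma}^{MA2}$ based on

$$A^{(i)} = \sum_{j=1}^{N} (y_j^{(i)} - \bar{y}^{(i)})(y_j^{(i)} - \bar{y}^{(i)})',$$

where $\bar{y}^{(i)} = N^{-1} \sum_{j=1}^{N} y_j^{(i)}$. As a discriminant function, we use the Mahalanobis distance based on each of the estimates $\hat{\Sigma}^{U}(A^{(i)})$, $\hat{\Sigma}^{SDS}(A^{(i)})$, $\hat{\Sigma}^{KG}(A^{(i)})$, $\hat{\Sigma}^{MA1}(A^{(i)})$, $\hat{\Sigma}^{MA2}(A^{(i)})$; that is, for a test datum $x$,

$$MD_i^{*} = (x - \bar{y}^{(i)})'\, \hat{\Sigma}^{*}(A^{(i)})^{-1} (x - \bar{y}^{(i)}), \quad i = 1, 2, 3.$$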
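The classification rule can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it uses $A^{(i)}/(N-1)$ as a stand-in for $\hat{\Sigma}^{U}(A^{(i)})$ (the divisor $N - 1$ is our assumption), it does not implement the other four estimators, and the learning sets are hypothetical two-dimensional numbers rather than the iris data. All function names are ours.

```python
def mean_vector(samples):
    """Componentwise mean of a list of equal-length vectors."""
    n = len(samples)
    return [sum(s[k] for s in samples) / n for k in range(len(samples[0]))]


def scatter_matrix(samples):
    """A = sum_j (y_j - ybar)(y_j - ybar)' for one group."""
    ybar = mean_vector(samples)
    p = len(ybar)
    return [[sum((s[a] - ybar[a]) * (s[b] - ybar[b]) for s in samples)
             for b in range(p)] for a in range(p)]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    p = len(rhs)
    aug = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(p):
        pivot = max(range(col, p), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, p):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, p + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * p
    for i in reversed(range(p)):
        tail = sum(aug[i][c] * x[c] for c in range(i + 1, p))
        x[i] = (aug[i][p] - tail) / aug[i][i]
    return x


def mahalanobis(x, samples):
    """(x - ybar)' Sigma^{-1} (x - ybar) with Sigma = A / (N - 1) (assumed)."""
    ybar = mean_vector(samples)
    n_dof = len(samples) - 1
    sigma = [[v / n_dof for v in row] for row in scatter_matrix(samples)]
    diff = [xi - mi for xi, mi in zip(x, ybar)]
    return sum(d * w for d, w in zip(diff, solve(sigma, diff)))


def classify(x, groups):
    """Index of the group whose Mahalanobis distance to x is smallest."""
    distances = [mahalanobis(x, g) for g in groups]
    return distances.index(min(distances))


# Hypothetical two-dimensional learning sets, not the iris data.
groups = [
    [[0.0, 0.0], [1.0, 0.2], [0.2, 1.0], [1.1, 1.3]],
    [[5.0, 5.0], [6.0, 5.1], [5.2, 6.0], [6.1, 6.2]],
]
print(classify([0.5, 0.5], groups), classify([5.5, 5.5], groups))
```

Replacing the covariance estimate inside `mahalanobis` by another estimator changes only the line that builds `sigma`; the rest of the rule is unchanged.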

The eigenvalues of the covariance matrix within each group are as follows:

Group 1: (0.234, 0.039, 0.027, 0.009),

Group 2: (0.482, 0.075, 0.056, 0.011), (34)
Group 3: (0.688, 0.107, 0.057, 0.036).

We observe that 1) in each group, the largest eigenvalue is about 6 times as large as the second largest eigenvalue, and 2) the second largest eigenvalue is about 3 to 7 times as large as the smallest eigenvalue. We are interested in the performance of $\hat{\Sigma}^{MAd}$, $d = 1, 2$, with the population eigenvalues in (34), which are considered as a deviation from $(\infty, c, c, c)$, the ideal eigenvalues for $\hat{\Sigma}^{MAd}$, $d = 1, 2$.
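The two ratios quoted above follow directly from the eigenvalues in (34); the short sketch below recomputes them. The helper name `dispersion_ratios` is ours.

```python
eigenvalues = {
    "Group 1": (0.234, 0.039, 0.027, 0.009),
    "Group 2": (0.482, 0.075, 0.056, 0.011),
    "Group 3": (0.688, 0.107, 0.057, 0.036),
}


def dispersion_ratios(values):
    """Return (largest / second largest, second largest / smallest)."""
    ordered = sorted(values, reverse=True)
    return ordered[0] / ordered[1], ordered[1] / ordered[-1]


for name, values in eigenvalues.items():
    first, second = dispersion_ratios(values)
    print(f"{name}: {first:.1f}, {second:.1f}")
```

The first ratio stays close to 6 in every group, while the second varies between roughly 3 and 7, so the three smaller eigenvalues are far from the equal values of the ideal configuration.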

We made three types of cross validations.

1. Leave-one-out: For a chosen $i = 1, 2, 3$, $j = 1, \ldots, 50$, leave $x_j^{(i)}$ out of the whole data as a test datum, and use the rest as a learning data set. We repeat this trial for every possible $(i, j)$. Consequently 150 trials were carried out.

2. 10-sample-set: First choose $x_1^{(i)}, \ldots, x_{10}^{(i)}$, $i = 1, 2, 3$, as a learning data set and use all the rest as test data. Next use $x_{11}^{(i)}, \ldots, x_{20}^{(i)}$, $i = 1, 2, 3$, as a learning data set and the others as test data. Repeatedly change the learning data set until every datum has been used once as learning data. In total we carried out 600 (= 120 x 5) trials.

3. 5-sample-set: First choose $x_1^{(i)}, \ldots, x_5^{(i)}$, $i = 1, 2, 3$, as a learning data set and use all the rest as test data. Next use $x_6^{(i)}, \ldots, x_{10}^{(i)}$, $i = 1, 2, 3$, as a learning data set and the others as test data. Repeatedly change the learning data set until every datum has been used once as learning data. In total we carried out 1350 (= 135 x 10) trials.
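The trial counts of the three schemes follow from simple bookkeeping, which the sketch below makes explicit. The function name `count_trials` and the constants are ours; a "trial" is counted as one classification of one test datum.

```python
GROUPS, PER_GROUP = 3, 50


def count_trials(learning_size):
    """Number of test classifications when consecutive blocks of
    `learning_size` samples per group serve in turn as the learning set."""
    blocks = PER_GROUP // learning_size
    tests_per_block = GROUPS * (PER_GROUP - learning_size)
    return blocks * tests_per_block


# Leave-one-out: each of the 150 samples is the test datum exactly once.
leave_one_out = GROUPS * PER_GROUP
print(leave_one_out, count_trials(10), count_trials(5))
```

For the 10-sample-set this gives 5 learning sets with 120 test data each, and for the 5-sample-set 10 learning sets with 135 test data each.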

We summarize the result on the correct classification percentage ("C.C.P." for abbreviation) 
of each discriminant function. 

1. Leave-one-out: All the discriminant functions returned the same classification for every test 
data and scored 96.67% of C.C.P. The misclassification occurred at the sample Xig , X21 , 

(2) (2) is) 

, 2:34 , 0:32 . With as many as 49 learning data, all the discriminant functions work quite correctly, and there is no difference among the functions.

2. 10-sample-set: See Table 8 for the C.C.P. in each learning data set and the average. Depending on the learning data set, different discriminant functions record the best C.C.P., but the margins are small and negligible. It seems that even the 10-sample learning set is too large to differentiate the functions.

3. 5-sample-set: See Table 9 for the C.C.P. in each learning data set and the average. In every learning data set, the functions based on $\hat{\Sigma}^{MAd}$, $d = 1, 2$, outperform the other functions. In particular, $\hat{\Sigma}^{MA1}$ always keeps the highest C.C.P. In total, $\hat{\Sigma}^{MA1}$ and $\hat{\Sigma}^{MA2}$ record a better C.C.P. than $\hat{\Sigma}^{U}$ by 8.89% and 6.52% respectively, while the margins of $\hat{\Sigma}^{SDS}$ and $\hat{\Sigma}^{KG}$ over $\hat{\Sigma}^{U}$ are respectively 2.60% and 1.26%.
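The margins quoted for the 5-sample-set are differences of the column averages in Table 9. The sketch below recomputes them from the ten rows; the column order (U, SDS, KG, MA1, MA2) is the one implied by the averages and margins, and the variable names are ours.

```python
names = ("U", "SDS", "KG", "MA1", "MA2")
ccp = [  # one row per learning data set of Table 9
    (66.67, 71.85, 68.89, 75.56, 75.56),
    (78.52, 80.00, 78.52, 85.19, 82.96),
    (41.48, 41.48, 41.48, 44.44, 42.96),
    (43.70, 46.67, 45.93, 53.33, 50.37),
    (88.89, 88.15, 88.89, 92.59, 90.37),
    (73.33, 78.52, 77.78, 89.63, 88.15),
    (64.44, 68.89, 67.41, 73.33, 71.85),
    (73.33, 75.56, 72.59, 82.96, 79.26),
    (73.33, 75.56, 72.59, 82.96, 79.26),
    (69.63, 72.59, 71.85, 82.22, 77.78),
]

# Column averages and their margins over the first column.
averages = [sum(row[k] for row in ccp) / len(ccp) for k in range(len(names))]
margins = {name: avg - averages[0] for name, avg in zip(names, averages)}
for name in names[1:]:
    print(f"{name}: {margins[name]:.2f}")
```

Since the table entries are rounded to two decimals, the recomputed margins agree with the quoted ones only up to rounding.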

A Appendix 

A.1 Proof of Lemma 1

In the following, $c_i$ ($i = 1, \ldots, 7$) represents some constant independent of $\alpha$, $\beta$.

The random variables $l = (l_1, \ldots, l_p)$ and $\tilde{G} = \Gamma' G$ have the following joint density function
with respect to the product measure of Lebesgue measure on $\mathcal{L}$ and the invariant probability
$\mu$ on $O^+(p)$:

\[
c_1 \prod_{i=1}^{p} \lambda_i^{-\frac{n}{2}} \prod_{i=1}^{p} l_i^{\frac{n-p-1}{2}} \prod_{j<i} (l_j - l_i)\,
\operatorname{etr}\Bigl(-\frac{1}{2} \tilde{G} L \tilde{G}' \Lambda^{-1}\Bigr).
\]
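We have

\[
\begin{aligned}
E[x(G, l, \lambda, \alpha, \beta)] &= E[x(\Gamma \tilde{G}, l, \lambda, \alpha, \beta)] \\
&= c_1 \prod_{i=1}^{p} \lambda_i^{-\frac{n}{2}} \int_{\mathcal{L}} \int_{O^+(p)} x(\Gamma G, l, \lambda, \alpha, \beta)
\prod_{i=1}^{p} l_i^{\frac{n-p-1}{2}} \prod_{j<i} (l_j - l_i) \\
&\qquad \times \operatorname{etr}\Bigl(-\frac{1}{2} G L G' \Lambda^{-1}\Bigr)\, d\mu(G)\, dl.
\end{aligned}
\]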
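Using the finite open cover $O^{(r)}$, $r = 0, \ldots, T$, in Subsection 2.1, we have

\[
E[x(G, l, \lambda, \alpha, \beta)] = \sum_{r=0}^{T} I_r, \tag{35}
\]

where

\[
\begin{aligned}
I_r &= c_1 \prod_{i=1}^{p} \lambda_i^{-\frac{n}{2}} \int_{\mathcal{L}} \int_{O^+(p)} \iota_r(G)\, x(\Gamma G, l, \lambda, \alpha, \beta)
\prod_{i=1}^{p} l_i^{\frac{n-p-1}{2}} \prod_{j<i} (l_j - l_i) \\
&\qquad \times \operatorname{etr}\Bigl(-\frac{1}{2} G L G' \Lambda^{-1}\Bigr)\, d\mu(G)\, dl.
\end{aligned}
\]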



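First we consider $I_0$. Let $M$ denote the support of $\iota_0$. From (19),

\[
\begin{aligned}
|I_0| &\le c_1 \prod_{i=1}^{p} \lambda_i^{-\frac{n}{2}} \int_{\mathcal{L}} \int_{M} |x(\Gamma G, l, \lambda, \alpha, \beta)|
\prod_{i=1}^{p} l_i^{\frac{n-p-1}{2}} \prod_{j<i} (l_j - l_i)
\operatorname{etr}\Bigl(-\frac{1}{2} G L G' \Lambda^{-1}\Bigr)\, d\mu(G)\, dl \\
&\le c_1 b \prod_{i=1}^{p} \lambda_i^{-\frac{n}{2}} \int_{\mathcal{L}} \int_{M}
\prod_{i=1}^{p} l_i^{\frac{n-p-1}{2}} \prod_{j<i} (l_j - l_i)
\operatorname{etr}\Bigl(-\frac{1}{2} G L G' \tilde{\Lambda}^{-1}\Bigr)\, d\mu(G)\, dl \\
&= c_2\, P(G \in M \mid \Sigma = \Gamma \tilde{\Lambda} \Gamma'),
\end{aligned} \tag{36}
\]

where $\tilde{\Lambda} = (1 - 2a)^{-1} \Lambda$. Note $O^+(p) \setminus M$ is an open set including $O(m, p - m)$, hence by 2 of
Theorem 1, $\lim_{\beta/\alpha \to 0} P(G \in O^+(p) \setminus M \mid \Sigma = \Gamma \tilde{\Lambda} \Gamma') = 1$, which means

\[
P(G \in M \mid \Sigma = \Gamma \tilde{\Lambda} \Gamma') \to 0
\]

as $\beta/\alpha \to 0$. Therefore

\[
\lim_{\beta/\alpha \to 0} I_0 = 0. \tag{37}
\]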

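Now we focus on $r = 1, \ldots, T$. Since $\mu$ is invariant and the support of $\iota_r(G)$ is
contained in $O^{(r)}$, we have

\[
\begin{aligned}
I_r &= c_1 \prod_{i=1}^{p} \lambda_i^{-\frac{n}{2}} \int_{\mathcal{L}} \int_{V} \iota_r(H^{(r)} G)\, x(\Gamma H^{(r)} G, l, \lambda, \alpha, \beta)
\prod_{i=1}^{p} l_i^{\frac{n-p-1}{2}} \prod_{j<i} (l_j - l_i) \\
&\qquad \times \operatorname{etr}\Bigl(-\frac{1}{2} H^{(r)} G L G' H^{(r)\prime} \Lambda^{-1}\Bigr)\, d\mu(G)\, dl.
\end{aligned}
\]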
We want to express the integral with respect to $d\mu(G)$ in terms of the local coordinates $u$ on $U$.
It is well known that the invariant measure $d\mu(G)$ has the exterior differential form expression

\[
c_3 \bigwedge_{i>j} g_j'\, dg_i, \tag{38}
\]

where $g_i$ is the $i$th column of $G$. Substituting the differentials

\[
dg_{ij} = du_{ij}, \quad i > j, \qquad
dg_{ij} = \sum_{k>l} \frac{\partial g_{ij}}{\partial u_{kl}}\, du_{kl}, \quad i \le j,
\]

into (38) and taking the wedge product of the terms, we see that

\[
\bigwedge_{i>j} g_j'\, dg_i = \pm J^*(u) \bigwedge_{i>j} du_{ij},
\]

where $J^*(u)$ is the Jacobian expressing the Radon-Nikodym derivative of the measure on $U$ induced
from the invariant measure on $O^+(p)$ with respect to the Lebesgue measure on $R^{p(p-1)/2}$. An explicit
form of $J^*(u)$ for small dimension $p$ is discussed in Appendix B in Takemura and Sheena (2005).
Since $J^*(u)$ is a $C^\infty$ function on $\bar{U}$, it is bounded and has a finite limit as $u \to 0$. By the change
of variables $(l, G) \to (l, u)$, $I_r$ is written as

\[
\begin{aligned}
I_r &= c_4 \prod_{i=1}^{p} \lambda_i^{-\frac{n}{2}} \int_{\mathcal{L}} \int_{U} \iota_r(H^{(r)} G(u))\, x(\Gamma H^{(r)} G(u), l, \lambda, \alpha, \beta)
\prod_{i=1}^{p} l_i^{\frac{n-p-1}{2}} \prod_{j<i} (l_j - l_i) \\
&\qquad \times \operatorname{etr}\Bigl(-\frac{1}{2} H^{(r)} G(u) L G'(u) H^{(r)\prime} \Lambda^{-1}\Bigr) J^*(u)\, du\, dl.
\end{aligned}
\]
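Consider the further coordinate transformation $(l, u) \to (d, q)$ for each $r$. Notice

\[
\prod_{i=1}^{p} l_i^{\frac{n-p-1}{2}} = \Bigl(\prod_{i=1}^{p} d_i^{\frac{n-p-1}{2}}\Bigr) \alpha^{\frac{m(n-p-1)}{2}} \beta^{\frac{(p-m)(n-p-1)}{2}}, \tag{39}
\]

\[
\begin{aligned}
\prod_{j<i} (l_j - l_i) &= \alpha^{\frac{m(m-1)}{2}} \beta^{\frac{(p-m)(p-m-1)}{2}}
\prod_{j \le m < i} (\alpha d_j - \beta d_i) \prod_{j<i\le m} (d_j - d_i) \prod_{m<j<i} (d_j - d_i) \\
&= \prod_{j \le m < i} \Bigl(1 - \frac{\beta d_i}{\alpha d_j}\Bigr) \prod_{j<i\le m} (d_j - d_i) \prod_{m<j<i} (d_j - d_i) \prod_{j=1}^{m} d_j^{\,p-m} \\
&\qquad \times \alpha^{m(p-m) + \frac{m(m-1)}{2}} \beta^{\frac{(p-m)(p-m-1)}{2}}
\end{aligned} \tag{40}
\]

and

\[
\begin{aligned}
\operatorname{tr} H^{(r)} G(u) L G'(u) H^{(r)\prime} \Lambda^{-1}
&= \operatorname{tr} H^{(r)} G(u) \operatorname{diag}(\alpha D_1, \beta D_2)\, G'(u) H^{(r)\prime} \operatorname{diag}(\alpha^{-1} \Xi_1^{-1}, \beta^{-1} \Xi_2^{-1}) \\
&= \operatorname{tr} H_1^{(r)} G_{11}(u) D_1 G_{11}'(u) H_1^{(r)\prime} \Xi_1^{-1} + \operatorname{tr} H_2^{(r)} G_{22}(u) D_2 G_{22}'(u) H_2^{(r)\prime} \Xi_2^{-1} \\
&\qquad + \operatorname{tr} Q_{21} Q_{21}' + \alpha^{-1} \beta \operatorname{tr} H_1^{(r)} G_{12}(u) D_2 G_{12}'(u) H_1^{(r)\prime} \Xi_1^{-1},
\end{aligned} \tag{41}
\]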
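
The factorization (40) is easy to check numerically. The following sketch assumes the parametrization $l_i = \alpha d_i$ ($i \le m$), $l_i = \beta d_i$ ($i > m$), which is the one consistent with (39); the function names and the sample values are ours and serve only as an illustration.

```python
import math


def product_direct(l):
    """Left-hand side of (40): product of (l_j - l_i) over j < i."""
    return math.prod(l[j] - l[i] for i in range(len(l)) for j in range(i))


def product_factored(d, m, alpha, beta):
    """Right-hand side of (40), assuming l_i = alpha*d_i (i <= m), beta*d_i (i > m)."""
    p = len(d)
    cross = math.prod(1.0 - beta * d[i] / (alpha * d[j])
                      for j in range(m) for i in range(m, p))
    upper = math.prod(d[j] - d[i] for i in range(m) for j in range(i))
    lower = math.prod(d[j] - d[i] for i in range(m, p) for j in range(m, i))
    d_power = math.prod(d[j] ** (p - m) for j in range(m))
    scale = (alpha ** (m * (p - m) + m * (m - 1) // 2)
             * beta ** ((p - m) * (p - m - 1) // 2))
    return cross * upper * lower * d_power * scale


# Hypothetical values: p = 4, m = 2, widely separated alpha and beta.
d = [3.0, 2.0, 1.5, 0.5]
m, alpha, beta = 2, 10.0, 0.1
l = [alpha * x for x in d[:m]] + [beta * x for x in d[m:]]
print(product_direct(l), product_factored(d, m, alpha, beta))
```

As $\beta/\alpha \to 0$ the cross factor tends to 1, which is the property used when the limit of $h$ is taken later in the proof.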
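where $u$ is actually the abbreviation for $u(d, q, \xi, \alpha, \beta)$ defined by (10). For notational simplicity
we use the same abbreviation $u = u(d, q, \xi, \alpha, \beta)$ for the rest of this proof. From (12), (16), (39),
(40) and (41), we have

\[
I_r = c_5 \int_{R^{p(p-1)/2}} \int_{R_+^p} \iota_r(H^{(r)} G(u))\, x(d, q, \xi, \alpha, \beta; \Gamma, H^{(r)})\, h(d, q, \xi, \alpha, \beta)\, dd\, dq,
\]

where $R_+^p = \{d \mid d_i > 0,\ i = 1, \ldots, p\}$ and $h(d, q, \xi, \alpha, \beta)$ is defined as follows:

\[
\begin{aligned}
h(d, q, \xi, \alpha, \beta) &= I(u \in U)\, J^*(u)\, I(\mathbf{d}_1 \in \mathcal{D}_1,\ \mathbf{d}_2 \in \mathcal{D}_2,\ (\mathbf{d}_1, \mathbf{d}_2) \in \mathcal{D}_3) \\
&\quad \times \prod_{i=1}^{m} d_i^{\frac{n-m-1}{2}} \prod_{i=m+1}^{p} d_i^{\frac{n-p-1}{2}}
\prod_{j<i\le m} (d_j - d_i) \prod_{m<j<i} (d_j - d_i) \prod_{j \le m < i} \Bigl(1 - \frac{\beta d_i}{\alpha d_j}\Bigr) \\
&\quad \times \exp\Bigl(-\frac{1}{2} \operatorname{tr} \sum_{s=1}^{2} H_s^{(r)} G_{ss}(u) D_s G_{ss}'(u) H_s^{(r)\prime} \Xi_s^{-1}\Bigr) \\
&\quad \times \operatorname{etr}\Bigl(-\frac{1}{2} Q_{21} Q_{21}'\Bigr) \times \operatorname{etr}\Bigl(-\frac{\beta}{2\alpha} H_1^{(r)} G_{12}(u) D_2 G_{12}'(u) H_1^{(r)\prime} \Xi_1^{-1}\Bigr).
\end{aligned}
\]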

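We will show that

\[
\iota_r(H^{(r)} G(u))\, x(d, q, \xi, \alpha, \beta; \Gamma, H^{(r)})\, h(d, q, \xi, \alpha, \beta)
\]

is bounded in $(\alpha, \beta)$. First $I(u \in U) J^*(u) \le K$ for some $K$ $(> 0)$ since $J^*(u)$ is bounded on the
compact set $\bar{U}$. Clearly

\[
0 \le I(\mathbf{d}_1 \in \mathcal{D}_1,\ \mathbf{d}_2 \in \mathcal{D}_2,\ (\mathbf{d}_1, \mathbf{d}_2) \in \mathcal{D}_3) \prod_{j \le m < i} \Bigl(1 - \frac{\beta d_i}{\alpha d_j}\Bigr) \le 1.
\]

From the condition (19), we have

\[
\begin{aligned}
|x(d, q, \xi, \alpha, \beta; \Gamma, H^{(r)})| &= |x(\Gamma H^{(r)} G(u), l, \lambda, \alpha, \beta)| \\
&\le b \operatorname{etr}\bigl(a H^{(r)} G(u) L G'(u) H^{(r)\prime} \Lambda^{-1}\bigr) \quad \text{a.e. in } (d, q).
\end{aligned}
\]

Therefore

\[
\begin{aligned}
&|\iota_r(H^{(r)} G(u))\, x(d, q, \xi, \alpha, \beta; \Gamma, H^{(r)})\, h(d, q, \xi, \alpha, \beta)| \\
&\quad \le c_6\, I(u \in U) \prod_{i=1}^{m} d_i^{\frac{n-m-1}{2}} \prod_{i=m+1}^{p} d_i^{\frac{n-p-1}{2}}
\prod_{j<i\le m} |d_j - d_i| \prod_{m<j<i} |d_j - d_i| \\
&\qquad \times \exp\Bigl(-\frac{1-2a}{2} \operatorname{tr} \sum_{s=1}^{2} H_s^{(r)} G_{ss}(u) D_s G_{ss}'(u) H_s^{(r)\prime} \Xi_s^{-1}\Bigr) \\
&\qquad \times \operatorname{etr}\Bigl(-\frac{1-2a}{2} Q_{21} Q_{21}'\Bigr).
\end{aligned} \tag{42}
\]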

Note that 

I(u eU) < I(u e < /(l^ijl = \qij\ < e, I < j < i < m, m < j < i < p). 
Choose some ^ such that C > 6) i = 1, • • • ,p. Consequently the left-hand side of (42) is bounded
by h{d,q), where 

h{d, q) = C6 I{\qij\ < e, I < j < i < m, m< j < i < p) 

J''' n — m — 1 P n—p—l 

" U ' n \dJ-d^\ n \dj-di\ 

1=1 i=m+l j<i<m m<j<i 

xexp(-^{l-2a)Y,di) x etr( —^Q^.Q'^^y 

1=1 

Let $\nu_1 = m(m-1)/2$, $\nu_2 = (p-m)(p-m-1)/2$, $\nu_3 = m(p-m)$. We have

/ / hid, q) dqdd = / / / h{d, q) dqu dq22 dq2i dd 

Jr\ Ji?p(p- i)/2 Jrp Jwa 

f V". n—m — l P n—p—l 

= ' li ' n \dj-di\ n 

+ i=l i=m+l j<i<m m<j<i 

E-1 P p 1 — 2a 

X exp 

(- - '2o)Y.di)dd X / etr( —Q^^Q'^A dq2i 

X 1 dqu / 1 dq22 < oo. 

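The integrability of $h(d, q)$ guarantees the use of the dominated convergence theorem. From (18)
and (20),

\[
\begin{aligned}
\lim_{\beta/\alpha \to 0} I_r &= c_5 \int\!\!\int \lim_{\beta/\alpha \to 0} \iota_r(H^{(r)} G(u)) \lim_{\beta/\alpha \to 0} x(d, q, \xi, \alpha, \beta; \Gamma, H^{(r)}) \\
&\qquad \times \lim_{\beta/\alpha \to 0} h(d, q, \xi, \alpha, \beta)\, dd\, dq \\
&= c_5 \int\!\!\int \iota_r(H^{(r)} G(q_{11}, q_{22}, 0))\, x_\Gamma(H^{(r)} G(q_{11}, q_{22}, 0), d, Q_{21}, \xi) \\
&\qquad \times \lim_{\beta/\alpha \to 0} h(d, q, \xi, \alpha, \beta)\, dd\, dq.
\end{aligned}
\]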

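We consider $\lim_{\beta/\alpha \to 0} h(d, q, \xi, \alpha, \beta)$. First notice that

\[
\lim_{\beta/\alpha \to 0} I(\mathbf{d}_1 \in \mathcal{D}_1,\ \mathbf{d}_2 \in \mathcal{D}_2,\ (\mathbf{d}_1, \mathbf{d}_2) \in \mathcal{D}_3) = I(\mathbf{d}_1 \in \mathcal{D}_1)\, I(\mathbf{d}_2 \in \mathcal{D}_2).
\]

From (17), we find

\[
\lim_{\beta/\alpha \to 0} J^*(u) = J^*(q_{11}, q_{22}, 0),
\]

\[
\lim_{\beta/\alpha \to 0} I(u \in U) = I((u_{11}, u_{22}) = (q_{11}, q_{22}) \in U_0),
\]

where $U_0 = \{(u_{11}, u_{22}) \mid (u_{11}, u_{22}, 0) \in U\}$ denotes the slice of $U$ by $u_{12} = 0$, and that

\[
\begin{aligned}
\lim_{\beta/\alpha \to 0} G_{11}(u) &= G_{11}(q_{11}, q_{22}, 0) \in O^+(m), \\
\lim_{\beta/\alpha \to 0} G_{22}(u) &= G_{22}(q_{11}, q_{22}, 0) \in O^+(p - m), \\
\lim_{\beta/\alpha \to 0} G_{21}(u) &= 0, \\
\lim_{\beta/\alpha \to 0} \operatorname{etr}\Bigl(-\frac{\beta}{2\alpha} H_1^{(r)} G_{12}(u) D_2 G_{12}'(u) H_1^{(r)\prime} \Xi_1^{-1}\Bigr) &= 1.
\end{aligned}
\]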

Since $d\mu$ is invariant, especially w.r.t. both of the transformations

\[
G \to \operatorname{diag}(H_1, H_2)\, G, \qquad G \to G \operatorname{diag}(H_1, H_2), \tag{43}
\]

the measure on $U_0$ given by

\[
J^*(q_{11}, q_{22}, 0)\, dq_{11}\, dq_{22} \tag{44}
\]

induces the invariant measure on $V_0$, the slice of $V$ by $G_{12} = 0$, w.r.t. (43) through

\[
G(q_{11}, q_{22}, 0) = \operatorname{diag}(G_{11}(q_{11}, q_{22}, 0),\ G_{22}(q_{11}, q_{22}, 0)). \tag{45}
\]

If $G_{11}$ and $G_{22}$ independently follow the invariant probability distributions respectively on $O^+(m)$
and $O^+(p - m)$, then the distribution on $V_0$ given by

\[
G_0 = \operatorname{diag}(G_{11}, G_{22}) \tag{46}
\]

is also invariant w.r.t. the transformations (43), hence must be proportional to the above-mentioned
distribution on $V_0$ given by (45) and (44). Consequently

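\[
\begin{aligned}
\lim_{\beta/\alpha \to 0} I_r &= c_6 \int_{R^{m(p-m)}} \int_{R_+^p} \int_{V_0} \iota_r(H^{(r)} G_0)\, x_\Gamma(H^{(r)} G_0, d, Q_{21}, \xi)\, I(\mathbf{d}_1 \in \mathcal{D}_1)\, I(\mathbf{d}_2 \in \mathcal{D}_2) \\
&\quad \times \prod_{i=1}^{m} d_i^{\frac{n-m-1}{2}} \prod_{i=m+1}^{p} d_i^{\frac{n-p-1}{2}} \prod_{j<i\le m} (d_j - d_i) \prod_{m<j<i} (d_j - d_i) \\
&\quad \times \exp\Bigl(-\frac{1}{2} \operatorname{tr} \sum_{s=1}^{2} H_s^{(r)} G_{ss} D_s G_{ss}' H_s^{(r)\prime} \Xi_s^{-1}\Bigr) \\
&\quad \times \operatorname{etr}\Bigl(-\frac{1}{2} Q_{21} Q_{21}'\Bigr)\, d\mu_1(G_{11})\, d\mu_2(G_{22})\, dd\, dq_{21},
\end{aligned}
\]

where $G_0$ is given by (46), and $\mu_1$, $\mu_2$ are the invariant probability measures respectively on $O^+(m)$
and $O^+(p - m)$.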

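Let $O_0^{(r)}$ denote the slice of $O^{(r)}$ by $G_{12} = 0$. Since $O^{(r)} = H^{(r)} V$, $O_0^{(r)} = H^{(r)} V_0$. Consequently
for each $1 \le r \le T$,

\[
\begin{aligned}
\lim_{\beta/\alpha \to 0} I_r &= c_6 \int_{R^{m(p-m)}} \int_{R_+^p} \int_{O_0^{(r)}} \iota_r(G_0)\, x_\Gamma(G_0, d, Q_{21}, \xi)\, I(\mathbf{d}_1 \in \mathcal{D}_1)\, I(\mathbf{d}_2 \in \mathcal{D}_2) \\
&\quad \times \prod_{i=1}^{m} d_i^{\frac{n-m-1}{2}} \prod_{i=m+1}^{p} d_i^{\frac{n-p-1}{2}} \prod_{j<i\le m} (d_j - d_i) \prod_{m<j<i} (d_j - d_i) \\
&\quad \times \exp\Bigl(-\frac{1}{2} \operatorname{tr} \sum_{s=1}^{2} G_{ss} D_s G_{ss}' \Xi_s^{-1}\Bigr) \operatorname{etr}\Bigl(-\frac{1}{2} Q_{21} Q_{21}'\Bigr)\, d\mu_1(G_{11})\, d\mu_2(G_{22})\, dd\, dq_{21}.
\end{aligned}
\]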
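Note that $\bigcup_{r=1}^{T} O_0^{(r)} = O(m, p - m)$ and $\iota_r(G_0)$ vanishes on $O(m, p - m) \setminus O_0^{(r)}$. Therefore we have

\[
\begin{aligned}
\lim_{\beta/\alpha \to 0} I_r &= c_6 \int_{R^{m(p-m)}} \int_{R_+^p} \int_{O(m, p-m)} \iota_r(G_0)\, x_\Gamma(G_0, d, Q_{21}, \xi)\, I(\mathbf{d}_1 \in \mathcal{D}_1)\, I(\mathbf{d}_2 \in \mathcal{D}_2) \\
&\quad \times \prod_{i=1}^{m} d_i^{\frac{n-m-1}{2}} \prod_{i=m+1}^{p} d_i^{\frac{n-p-1}{2}} \prod_{j<i\le m} (d_j - d_i) \prod_{m<j<i} (d_j - d_i) \\
&\quad \times \exp\Bigl(-\frac{1}{2} \operatorname{tr} \sum_{s=1}^{2} G_{ss} D_s G_{ss}' \Xi_s^{-1}\Bigr) \operatorname{etr}\Bigl(-\frac{1}{2} Q_{21} Q_{21}'\Bigr)\, d\mu_1(G_{11})\, d\mu_2(G_{22})\, dd\, dq_{21}.
\end{aligned} \tag{47}
\]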



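From (35), (37) and (47), we have

\[
\begin{aligned}
\lim_{\beta/\alpha \to 0} E[x(G, l, \lambda, \alpha, \beta)]
&= c_6 \int_{R^{m(p-m)}} \int_{R_+^p} \int_{O^+(p-m)} \int_{O^+(m)} x_\Gamma(G_0, d, Q_{21}, \xi)\, I(\mathbf{d}_1 \in \mathcal{D}_1)\, I(\mathbf{d}_2 \in \mathcal{D}_2) \\
&\quad \times \prod_{i=1}^{m} d_i^{\frac{n-m-1}{2}} \prod_{i=m+1}^{p} d_i^{\frac{n-p-1}{2}} \prod_{j<i\le m} (d_j - d_i) \prod_{m<j<i} (d_j - d_i) \\
&\quad \times \exp\Bigl(-\frac{1}{2} \operatorname{tr} \sum_{s=1}^{2} G_{ss} D_s G_{ss}' \Xi_s^{-1}\Bigr) \times \operatorname{etr}\Bigl(-\frac{1}{2} Q_{21} Q_{21}'\Bigr)\, d\mu_1(G_{11})\, d\mu_2(G_{22})\, dd\, dq_{21}.
\end{aligned} \tag{48}
\]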

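Under the distribution (22) and the spectral decompositions (23), the joint density function of
$(\mathbf{d}_1, G_{11})$ ($(\mathbf{d}_2, G_{22})$) with respect to the product measure of Lebesgue measure on $R^m$ ($R^{p-m}$)
and the invariant probability measure $\mu_1$ ($\mu_2$) on $O^+(m)$ ($O^+(p - m)$) is given by the following
functions, $F_1(G_{11}, \mathbf{d}_1)$ ($F_2(G_{22}, \mathbf{d}_2)$):

\[
\begin{aligned}
F_1(G_{11}, \mathbf{d}_1) &= K_1 |\Xi_1|^{-\frac{n}{2}} \prod_{i=1}^{m} d_i^{\frac{n-m-1}{2}} \prod_{j<i\le m} (d_j - d_i) \operatorname{etr}\Bigl(-\frac{1}{2} G_{11} D_1 G_{11}' \Xi_1^{-1}\Bigr), \\
F_2(G_{22}, \mathbf{d}_2) &= K_2 |\Xi_2|^{-\frac{n-m}{2}} \prod_{i=m+1}^{p} d_i^{\frac{n-p-1}{2}} \prod_{m<j<i\le p} (d_j - d_i) \operatorname{etr}\Bigl(-\frac{1}{2} G_{22} D_2 G_{22}' \Xi_2^{-1}\Bigr),
\end{aligned}
\]

with $K_1$, $K_2$ as normalizing constants. The density function of $Z_{21}$ is given by

\[
F_3(z_{21}) = K_3 \operatorname{etr}\Bigl(-\frac{1}{2} Z_{21} Z_{21}'\Bigr),
\]

where $K_3$ is a normalizing constant. Using $F_1(G_{11}, \mathbf{d}_1)$, $F_2(G_{22}, \mathbf{d}_2)$, $F_3(z_{21})$, we can rewrite the
right-hand side of (48) as

\[
\begin{aligned}
c_7 \int_{R^{m(p-m)}} \int_{\mathcal{D}_2} \int_{\mathcal{D}_1} \int_{O^+(p-m)} \int_{O^+(m)} & x_\Gamma(G_0, (\mathbf{d}_1, \mathbf{d}_2), Z_{21}, \xi) \\
& \times F_1(G_{11}, \mathbf{d}_1)\, F_2(G_{22}, \mathbf{d}_2)\, F_3(z_{21})\, d\mu_1(G_{11})\, d\mu_2(G_{22})\, d\mathbf{d}_1\, d\mathbf{d}_2\, dz_{21}.
\end{aligned}
\]

If we consider the special case $x(G, l, \lambda, \alpha, \beta) = 1$, we notice that $c_7 = 1$.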
A.2 Proof of Lemma 2

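Using Lemma 1, we will calculate

\[
\lim_{\beta/\alpha \to 0} E[\operatorname{tr} G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma'], \qquad
\lim_{\beta/\alpha \to 0} E[\operatorname{tr} (G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma')^2].
\]

Let

\[
\begin{aligned}
x_1(G, l, \lambda, \alpha, \beta) &= \operatorname{tr}(G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma'), \\
x_2(G, l, \lambda, \alpha, \beta) &= \operatorname{tr}(G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma')^2,
\end{aligned}
\]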



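then

\[
\begin{aligned}
x_1(\Gamma G, l, \lambda, \alpha, \beta) &= \sum_{i=1}^{p} \sum_{j=1}^{p} \lambda_i^{-1} l_j c_j g_{ij}^2 \\
&\le (\max_j c_j) \sum_{i=1}^{p} \sum_{j=1}^{p} \lambda_i^{-1} l_j g_{ij}^2 = 3 (\max_j c_j) \operatorname{tr}\Bigl(\frac{1}{3} G L G' \Lambda^{-1}\Bigr) \\
&\le 3 (\max_j c_j) \operatorname{etr}\Bigl(\frac{1}{3} G L G' \Lambda^{-1}\Bigr),
\end{aligned}
\]

\[
\begin{aligned}
x_2(\Gamma G, l, \lambda, \alpha, \beta) &= \operatorname{tr}(\Lambda^{-1/2} G L^{1/2} C L^{1/2} G' \Lambda^{-1/2})^2
\le \Bigl(\sum_{i=1}^{p} \sum_{j=1}^{p} \lambda_i^{-1} l_j c_j g_{ij}^2\Bigr)^2 \\
&\le \Bigl((\max_j c_j) \sum_{i=1}^{p} \sum_{j=1}^{p} \lambda_i^{-1} l_j g_{ij}^2\Bigr)^2 = \Bigl\{6 (\max_j c_j) \operatorname{tr}\Bigl(\frac{1}{6} G L G' \Lambda^{-1}\Bigr)\Bigr\}^2 \\
&\le \Bigl\{6 (\max_j c_j) \operatorname{etr}\Bigl(\frac{1}{6} G L G' \Lambda^{-1}\Bigr)\Bigr\}^2 = 36 (\max_j c_j)^2 \operatorname{etr}\Bigl(\frac{1}{3} G L G' \Lambda^{-1}\Bigr),
\end{aligned}
\]

hence (19) is satisfied for both $x_1$ and $x_2$. Now let

\[
B(G, l, \lambda) = \Lambda^{-1/2} H^{(r)} G L^{1/2}
\]

for each $r$. Then we have

\[
\begin{aligned}
x_1(\Gamma H^{(r)} G, l, \lambda, \alpha, \beta) &= \operatorname{tr} B C B', \\
x_2(\Gamma H^{(r)} G, l, \lambda, \alpha, \beta) &= \operatorname{tr}(B C B')^2.
\end{aligned}
\]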

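We notice that

\[
\begin{aligned}
B &= \Lambda^{-1/2} H^{(r)} G(u) L^{1/2} \\
&= \begin{pmatrix}
\Lambda_1^{-1/2} H_1^{(r)} G_{11}(u) L_1^{1/2} & \Lambda_1^{-1/2} H_1^{(r)} G_{12}(u) L_2^{1/2} \\
\Lambda_2^{-1/2} H_2^{(r)} G_{21}(u) L_1^{1/2} & \Lambda_2^{-1/2} H_2^{(r)} G_{22}(u) L_2^{1/2}
\end{pmatrix} \\
&= \begin{pmatrix}
\Xi_1^{-1/2} H_1^{(r)} G_{11}(u) D_1^{1/2} & \alpha^{-1/2} \beta^{1/2} \Xi_1^{-1/2} H_1^{(r)} G_{12}(u) D_2^{1/2} \\
\alpha^{1/2} \beta^{-1/2} \Xi_2^{-1/2} H_2^{(r)} G_{21}(u) D_1^{1/2} & \Xi_2^{-1/2} H_2^{(r)} G_{22}(u) D_2^{1/2}
\end{pmatrix} \\
&= \begin{pmatrix}
\Xi_1^{-1/2} H_1^{(r)} G_{11}(u) D_1^{1/2} & \alpha^{-1/2} \beta^{1/2} \Xi_1^{-1/2} H_1^{(r)} G_{12}(u) D_2^{1/2} \\
Q_{21} & \Xi_2^{-1/2} H_2^{(r)} G_{22}(u) D_2^{1/2}
\end{pmatrix}.
\end{aligned}
\]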
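Substitute $u(d, q, \xi, \alpha, \beta)$ for $u$ in the last matrix and denote it by $B(d, q, \xi, \alpha, \beta)$. Then

\[
\begin{aligned}
x_1(d, q, \xi, \alpha, \beta; \Gamma, H^{(r)}) &= \operatorname{tr} B(d, q, \xi, \alpha, \beta)\, C\, B'(d, q, \xi, \alpha, \beta), \\
x_2(d, q, \xi, \alpha, \beta; \Gamma, H^{(r)}) &= \operatorname{tr}\bigl(B(d, q, \xi, \alpha, \beta)\, C\, B'(d, q, \xi, \alpha, \beta)\bigr)^2.
\end{aligned}
\]

Therefore

\[
\begin{aligned}
\lim_{\beta/\alpha \to 0} x_1(d, q, \xi, \alpha, \beta; \Gamma, H^{(r)}) &= \operatorname{tr} \bar{B} C \bar{B}', \\
\lim_{\beta/\alpha \to 0} x_2(d, q, \xi, \alpha, \beta; \Gamma, H^{(r)}) &= \operatorname{tr}(\bar{B} C \bar{B}')^2,
\end{aligned}
\]

where

\[
\bar{B} = \begin{pmatrix} B_{11} & B_{12} \\ B_{21} & B_{22} \end{pmatrix} = \lim_{\beta/\alpha \to 0} B(d, q, \xi, \alpha, \beta)
\]

is given by

\[
\begin{aligned}
B_{11} &= \Xi_1^{-1/2} H_1^{(r)} G_{11}(q_{11}, q_{22}, 0) D_1^{1/2} = \Xi_1^{-1/2} (H^{(r)} G(q_{11}, q_{22}, 0))_{11} D_1^{1/2}, \\
B_{12} &= 0, \\
B_{21} &= Q_{21}, \\
B_{22} &= \Xi_2^{-1/2} H_2^{(r)} G_{22}(q_{11}, q_{22}, 0) D_2^{1/2} = \Xi_2^{-1/2} (H^{(r)} G(q_{11}, q_{22}, 0))_{22} D_2^{1/2},
\end{aligned}
\]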
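because of (18). By straightforward calculation we have

\[
\begin{aligned}
\operatorname{tr}(\bar{B} C \bar{B}') &= \operatorname{tr} B_{11} C_1 B_{11}' + \operatorname{tr} B_{12} C_2 B_{12}' + \operatorname{tr} B_{21} C_1 B_{21}' + \operatorname{tr} B_{22} C_2 B_{22}' \\
&= \sum_{s=1}^{2} \operatorname{tr} (H^{(r)} G(q_{11}, q_{22}, 0))_{ss} D_s^{1/2} C_s D_s^{1/2} (H^{(r)} G(q_{11}, q_{22}, 0))_{ss}' \Xi_s^{-1} + \operatorname{tr} Q_{21} C_1 Q_{21}',
\end{aligned}
\]

\[
\begin{aligned}
\operatorname{tr}(\bar{B} C \bar{B}')^2 &= \operatorname{tr}(C \bar{B}' \bar{B})^2 \\
&= \operatorname{tr} \begin{pmatrix}
C_1 (B_{11}' B_{11} + B_{21}' B_{21}) & C_1 B_{21}' B_{22} \\
C_2 B_{22}' B_{21} & C_2 B_{22}' B_{22}
\end{pmatrix}^2 \\
&= \operatorname{tr}\bigl(C_1 (B_{11}' B_{11} + B_{21}' B_{21})\bigr)^2 + 2 \operatorname{tr} C_1 B_{21}' B_{22} C_2 B_{22}' B_{21} + \operatorname{tr}(C_2 B_{22}' B_{22})^2 \\
&= \operatorname{tr}\bigl(C_1 D_1^{1/2} (H^{(r)} G(q_{11}, q_{22}, 0))_{11}' \Xi_1^{-1} (H^{(r)} G(q_{11}, q_{22}, 0))_{11} D_1^{1/2} + C_1 Q_{21}' Q_{21}\bigr)^2 \\
&\quad + 2 \operatorname{tr}\bigl(C_1 Q_{21}' \Xi_2^{-1/2} (H^{(r)} G(q_{11}, q_{22}, 0))_{22} D_2^{1/2} C_2 \\
&\qquad\qquad \times D_2^{1/2} (H^{(r)} G(q_{11}, q_{22}, 0))_{22}' \Xi_2^{-1/2} Q_{21}\bigr) \\
&\quad + \operatorname{tr}\bigl(C_2 D_2^{1/2} (H^{(r)} G(q_{11}, q_{22}, 0))_{22}' \Xi_2^{-1} (H^{(r)} G(q_{11}, q_{22}, 0))_{22} D_2^{1/2}\bigr)^2.
\end{aligned}
\]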



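Consequently we have the following results; all the asymptotic expectations below are taken with
respect to the distributions in (22) and the spectral decompositions (23).

\[
\begin{aligned}
\lim_{\beta/\alpha \to 0} E[\operatorname{tr} G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma']
&= E[\operatorname{tr} G_{11} D_1^{1/2} C_1 D_1^{1/2} G_{11}' \Xi_1^{-1}] + E[\operatorname{tr} G_{22} D_2^{1/2} C_2 D_2^{1/2} G_{22}' \Xi_2^{-1}] \\
&\quad + E[\operatorname{tr} Z_{21} C_1 Z_{21}'],
\end{aligned} \tag{49}
\]

\[
\begin{aligned}
\lim_{\beta/\alpha \to 0} E[\operatorname{tr}(G L^{1/2} C L^{1/2} G' \Gamma \Lambda^{-1} \Gamma')^2]
&= E[\operatorname{tr}(C_1 D_1^{1/2} G_{11}' \Xi_1^{-1} G_{11} D_1^{1/2} + C_1 Z_{21}' Z_{21})^2] \\
&\quad + 2 E[\operatorname{tr} C_1 Z_{21}' \Xi_2^{-1/2} G_{22} D_2^{1/2} C_2 D_2^{1/2} G_{22}' \Xi_2^{-1/2} Z_{21}] \\
&\quad + E[\operatorname{tr}(C_2 D_2^{1/2} G_{22}' \Xi_2^{-1} G_{22} D_2^{1/2})^2] \\
&= E[\operatorname{tr}(G_{11} D_1^{1/2} C_1 D_1^{1/2} G_{11}' \Xi_1^{-1})^2] \\
&\quad + 2 \operatorname{tr} E[C_1 D_1^{1/2} G_{11}' \Xi_1^{-1} G_{11} D_1^{1/2} C_1]\, E[Z_{21}' Z_{21}] \\
&\quad + E[\operatorname{tr} C_1 Z_{21}' Z_{21} C_1 Z_{21}' Z_{21}] \\
&\quad + 2 \operatorname{tr} E[\Xi_2^{-1/2} G_{22} D_2^{1/2} C_2 D_2^{1/2} G_{22}' \Xi_2^{-1/2}]\, E[Z_{21} C_1 Z_{21}'] \\
&\quad + E[\operatorname{tr}(G_{22} D_2^{1/2} C_2 D_2^{1/2} G_{22}' \Xi_2^{-1})^2].
\end{aligned} \tag{50}
\]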

We further calculate the expectations related to $Z_{21}$. It is obvious that

\[
E[Z_{21} C_1 Z_{21}'] = (\operatorname{tr} C_1)\, I_{p-m}, \qquad E[Z_{21}' Z_{21}] = (p - m)\, I_m, \tag{51}
\]

since $(Z_{21})_{ij}$, $1 \le i \le p - m$, $1 \le j \le m$, are all independently distributed as the standard normal
distribution. Letting $T = (t_{ij}) = Z_{21}' Z_{21}$ we have

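\[
E[\operatorname{tr} C_1 Z_{21}' Z_{21} C_1 Z_{21}' Z_{21}] = E[\operatorname{tr} C_1 T C_1 T] = \sum_{i=1}^{m} c_i E[(T C_1 T)_{ii}]
= \sum_{i=1}^{m} \sum_{s=1}^{m} c_i c_s E[t_{is}^2], \tag{52}
\]

while

\[
\begin{aligned}
E[t_{is}^2] &= E\Bigl[\Bigl(\sum_{j=1}^{p-m} (Z_{21})_{ji} (Z_{21})_{js}\Bigr)^2\Bigr] \\
&= E\Bigl[\sum_{j=1}^{p-m} (Z_{21})_{ji}^2 (Z_{21})_{js}^2 + 2 \sum_{j_1 < j_2} (Z_{21})_{j_1 i} (Z_{21})_{j_1 s} (Z_{21})_{j_2 i} (Z_{21})_{j_2 s}\Bigr].
\end{aligned} \tag{53}
\]

We also have

\[
E[(Z_{21})_{ji}^2 (Z_{21})_{js}^2] = \begin{cases} 3, & \text{if } i = s, \\ 1, & \text{if } i \ne s, \end{cases} \tag{54}
\]

and, for $j_1 \ne j_2$,

\[
E[(Z_{21})_{j_1 i} (Z_{21})_{j_1 s} (Z_{21})_{j_2 i} (Z_{21})_{j_2 s}] = \begin{cases} 1, & \text{if } i = s, \\ 0, & \text{if } i \ne s. \end{cases} \tag{55}
\]

Substituting (54), (55) into (53), we have

\[
E[t_{is}^2] = \begin{cases} (p - m)(p - m + 2), & \text{if } i = s, \\ p - m, & \text{if } i \ne s. \end{cases} \tag{56}
\]

Consequently from (52) and (56),

\[
E[\operatorname{tr} C_1 Z_{21}' Z_{21} C_1 Z_{21}' Z_{21}] = (p - m)(p - m + 2) \sum_{i=1}^{m} c_i^2 + 2 (p - m) \sum_{1 \le i < s \le m} c_i c_s. \tag{57}
\]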

Substituting (51) and (57) into (49), (50), we have the result. 
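
The closed form (57) can be compared with a simulation. The sketch below is an illustration only: it draws $Z_{21}$ as a $(p-m) \times m$ matrix of independent standard normal variables, takes $C_1$ diagonal with hypothetical entries $c = (1, 0.5)$, and averages $\operatorname{tr} C_1 T C_1 T = \sum_{i,s} c_i c_s t_{is}^2$; the function names, the seed and the number of replications are our choices.

```python
import random


def closed_form_57(c, k):
    """Right-hand side of (57) with k = p - m and diagonal C_1 = diag(c)."""
    squares = sum(ci * ci for ci in c)
    cross = sum(c[i] * c[s] for i in range(len(c)) for s in range(i + 1, len(c)))
    return k * (k + 2) * squares + 2 * k * cross


def monte_carlo_57(c, k, n_rep=40000, seed=0):
    """Monte Carlo estimate of E[tr C_1 T C_1 T] with T = Z'Z, Z a k x m N(0,1) matrix."""
    rng = random.Random(seed)
    m = len(c)
    total = 0.0
    for _ in range(n_rep):
        z = [[rng.gauss(0.0, 1.0) for _ in range(m)] for _ in range(k)]
        # t[i][s] is the (i, s) element of T = Z'Z
        t = [[sum(z[j][i] * z[j][s] for j in range(k)) for s in range(m)]
             for i in range(m)]
        total += sum(c[i] * c[s] * t[i][s] ** 2 for i in range(m) for s in range(m))
    return total / n_rep


c, k = [1.0, 0.5], 3
print(closed_form_57(c, k), monte_carlo_57(c, k))
```

The two numbers should agree up to Monte Carlo error; a larger `n_rep` narrows the gap.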
A.3 Analytic Evaluation of Asymptotic Risk

We illustrate an analytic calculation of $E[d_i]$, $E[d_i^2]$, $i = 1, \ldots, p$, by the case $p = 4$, $m = 1$ and
$n$ $(> 4)$ even. Suppose $S \sim W_3(n, I_3)$. Note the density function of $l = (l_1, l_2, l_3)$ is given by (see
e.g. Theorem 3.2.18 of Muirhead (1982))

\[
K_3(n) \prod_{i=1}^{3} l_i^{\,u}\, (l_1 - l_2)(l_1 - l_3)(l_2 - l_3) \exp\Bigl(-\frac{1}{2} \sum_{i=1}^{3} l_i\Bigr),
\]

where $u = u(n) = (n - 4)/2$, which is an integer, and

\[
K_3(n) = \pi^{3/2} \big/ \bigl(2^{3n/2}\, \Gamma(n/2)\, \Gamma((n-1)/2)\, \Gamma(n/2 - 1)\, \Gamma(3/2)\, \Gamma(1)\, \Gamma(1/2)\bigr).
\]

Let

\[
\Delta_1 = l_1 - l_2, \qquad \Delta_2 = l_2 - l_3, \qquad \Delta_3 = l_3.
\]

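The density function $f_3(\Delta)$ of $\Delta = (\Delta_1, \Delta_2, \Delta_3)$ is given by

\[
\begin{aligned}
f_3(\Delta) &= K_3(n)\, \Delta_3^{u} (\Delta_2 + \Delta_3)^{u} (\Delta_1 + \Delta_2 + \Delta_3)^{u} \\
&\quad \times \Delta_1 \Delta_2 (\Delta_1 + \Delta_2) \exp\Bigl(-\frac{1}{2}(\Delta_1 + 2\Delta_2 + 3\Delta_3)\Bigr) \\
&= K_3(n) \Bigl(\sum_{i=0}^{u} \binom{u}{i} \Delta_2^{i} \Delta_3^{u-i}\Bigr)
\Bigl(\sum_{s=0}^{u} \sum_{t=0}^{u-s} \binom{u}{s} \binom{u-s}{t} \Delta_1^{s} \Delta_2^{t} \Delta_3^{u-s-t}\Bigr) \\
&\quad \times \Bigl(\sum_{j=0}^{1} \Delta_1^{j} \Delta_2^{1-j}\Bigr) \Delta_1 \Delta_2 \Delta_3^{u} \exp\Bigl(-\frac{1}{2}(\Delta_1 + 2\Delta_2 + 3\Delta_3)\Bigr) \\
&= K_3(n) \sum_{i=0}^{u} \sum_{j=0}^{1} \sum_{s=0}^{u} \sum_{t=0}^{u-s} \binom{u}{i} \binom{u}{s} \binom{u-s}{t}
\Delta_1^{j+s+1} \Delta_2^{i-j+t+2} \Delta_3^{3u-i-s-t} \\
&\quad \times \exp\Bigl(-\frac{1}{2}(\Delta_1 + 2\Delta_2 + 3\Delta_3)\Bigr).
\end{aligned}
\]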

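We define a function $F_3(x_1, x_2, x_3; n)$ of nonnegative integers $x_1, x_2, x_3$ by

\[
F_3(x_1, x_2, x_3; n) = E[\Delta_1^{x_1} \Delta_2^{x_2} \Delta_3^{x_3}].
\]

Then

\[
\begin{aligned}
F_3(x_1, x_2, x_3; n) &= K_3(n) \sum_{i=0}^{u} \sum_{j=0}^{1} \sum_{s=0}^{u} \sum_{t=0}^{u-s} \binom{u}{i} \binom{u}{s} \binom{u-s}{t}
\int_0^\infty \Delta_1^{j+s+1+x_1} \exp\Bigl(-\frac{1}{2}\Delta_1\Bigr) d\Delta_1 \\
&\quad \times \int_0^\infty \Delta_2^{i-j+t+2+x_2} \exp(-\Delta_2)\, d\Delta_2
\times \int_0^\infty \Delta_3^{3u-i-s-t+x_3} \exp\Bigl(-\frac{3}{2}\Delta_3\Bigr) d\Delta_3 \\
&= K_3(n) \sum_{i=0}^{u} \sum_{j=0}^{1} \sum_{s=0}^{u} \sum_{t=0}^{u-s} \binom{u}{i} \binom{u}{s} \binom{u-s}{t}\,
2^{3u-i+j-t+x_1+x_3+3}\, 3^{-3u+i+s+t-x_3-1} \\
&\quad \times (j + s + x_1 + 1)!\, (i - j + t + 2 + x_2)! \\
&\quad \times (3u - i - s - t + x_3)!.
\end{aligned}
\]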
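
As a sanity check on the finite sum, the following sketch evaluates $F_3(x_1, x_2, x_3; n)$ term by term, using $\int_0^\infty x^a e^{-x/2} dx = a!\,2^{a+1}$, $\int_0^\infty x^b e^{-x} dx = b!$ and $\int_0^\infty x^c e^{-3x/2} dx = c!\,(2/3)^{c+1}$. It is only an illustration under the assumption that $n$ is even with $n \ge 4$; the function names `k3` and `f3_moment` are ours. Two identities serve as checks: $F_3(0,0,0;n) = 1$, and $E[l_1 + l_2 + l_3] = E[\operatorname{tr} S] = 3n$.

```python
import math


def k3(n):
    """Normalizing constant K_3(n) of the eigenvalue density of W_3(n, I_3)."""
    return math.pi ** 1.5 / (
        2.0 ** (1.5 * n)
        * math.gamma(n / 2) * math.gamma((n - 1) / 2) * math.gamma(n / 2 - 1)
        * math.gamma(1.5) * math.gamma(1.0) * math.gamma(0.5)
    )


def f3_moment(x1, x2, x3, n):
    """E[D1^x1 * D2^x2 * D3^x3] for D1 = l1 - l2, D2 = l2 - l3, D3 = l3."""
    if n < 4 or n % 2 != 0:
        raise ValueError("this sketch assumes n even and n >= 4")
    u = (n - 4) // 2
    total = 0.0
    for i in range(u + 1):
        for j in range(2):
            for s in range(u + 1):
                for t in range(u - s + 1):
                    a = j + s + 1 + x1          # exponent of D1
                    b = i - j + t + 2 + x2      # exponent of D2
                    c = 3 * u - i - s - t + x3  # exponent of D3
                    coef = math.comb(u, i) * math.comb(u, s) * math.comb(u - s, t)
                    total += (coef
                              * math.factorial(a) * 2.0 ** (a + 1)
                              * math.factorial(b)
                              * math.factorial(c) * (2.0 / 3.0) ** (c + 1))
    return k3(n) * total


n = 6
print(f3_moment(0, 0, 0, n))  # total probability
print(f3_moment(1, 0, 0, n) + 2 * f3_moment(0, 1, 0, n) + 3 * f3_moment(0, 0, 1, n))  # E[tr S]
```

The second printed quantity uses $l_1 + l_2 + l_3 = \Delta_1 + 2\Delta_2 + 3\Delta_3$.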
Note that for the case $p = 4$, $m = 1$, the distributions of $d_i$, $i = 1, \ldots, 4$, in Theorem 5 are given
as follows: $d_1 = W_{11} \sim \chi^2_n$ and $d_2 > d_3 > d_4$ are the ordered eigenvalues of $W_{22} \sim W_3(n - 1, I_3)$.

Using $\Delta_1 = d_2 - d_3$, $\Delta_2 = d_3 - d_4$, $\Delta_3 = d_4$ and $F_3(x_1, x_2, x_3; n)$ as above, we can calculate
$b = (b_1, \ldots, b_4)$ and $A = (a_{ij})_{1 \le i,j \le 4}$ in Theorem 5 as follows:

\[
\begin{aligned}
b_1 &= E[d_1] + (p - m) = n + 3, \\
b_2 &= E[d_2] = E[\Delta_1 + \Delta_2 + \Delta_3] = F_3(1,0,0; n-1) + F_3(0,1,0; n-1) + F_3(0,0,1; n-1), \\
b_3 &= E[d_3] = E[\Delta_2 + \Delta_3] = F_3(0,1,0; n-1) + F_3(0,0,1; n-1), \\
b_4 &= E[d_4] = E[\Delta_3] = F_3(0,0,1; n-1), \\
a_{11} &= E[d_1^2] + 2(p - m) E[d_1] + (p - m)(p - m + 2) \\
&= n^2 + 2n + 6n + 15 = n^2 + 8n + 15, \\
a_{22} &= E[d_2^2] = E\Bigl[\sum_{i=1}^{3} \Delta_i^2 + 2 \sum_{1 \le j < i \le 3} \Delta_i \Delta_j\Bigr] \\
&= F_3(2,0,0; n-1) + F_3(0,2,0; n-1) + F_3(0,0,2; n-1) \\
&\quad + 2F_3(1,1,0; n-1) + 2F_3(1,0,1; n-1) + 2F_3(0,1,1; n-1), \\
a_{33} &= E[d_3^2] = E[\Delta_2^2 + \Delta_3^2 + 2\Delta_2 \Delta_3] \\
&= F_3(0,2,0; n-1) + F_3(0,0,2; n-1) + 2F_3(0,1,1; n-1), \\
a_{44} &= E[d_4^2] = E[\Delta_3^2] = F_3(0,0,2; n-1), \\
a_{12} &= a_{21} = E[d_2] = F_3(1,0,0; n-1) + F_3(0,1,0; n-1) + F_3(0,0,1; n-1), \\
a_{13} &= a_{31} = E[d_3] = F_3(0,1,0; n-1) + F_3(0,0,1; n-1), \\
a_{14} &= a_{41} = E[d_4] = F_3(0,0,1; n-1), \\
a_{23} &= a_{32} = a_{24} = a_{42} = a_{34} = a_{43} = 0.
\end{aligned}
\]
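
The two entries that do not involve $F_3$ follow from the moments of the $\chi^2_n$ distribution, $E[d_1] = n$ and $E[d_1^2] = n(n+2)$. The short sketch below only re-does that arithmetic for $p = 4$, $m = 1$; the function name `first_block_coefficients` is ours.

```python
def chi2_raw_moments(n):
    """First and second raw moments of a chi-square variable with n degrees of freedom."""
    return n, n * (n + 2)


def first_block_coefficients(n, p=4, m=1):
    """Return (b_1, a_11) from the formulas above, using chi-square moments of d_1."""
    e1, e2 = chi2_raw_moments(n)
    k = p - m
    b1 = e1 + k
    a11 = e2 + 2 * k * e1 + k * (k + 2)
    return b1, a11


for n in (6, 8, 10):
    b1, a11 = first_block_coefficients(n)
    print(n, b1, a11, n + 3, n * n + 8 * n + 15)
```

For each $n$ the last two printed columns repeat the closed forms $n + 3$ and $n^2 + 8n + 15$, so the pairs can be compared directly.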